THE ESTIMATION OF BIOLOGICAL POPULATIONS¹
By Douglas G. Chapman
University of Washington 


Summary. A number of statistical models, underlying the methods used in 
the estimation of the sizes and other parameters of animal populations, are set 
up. The relevant estimation equations are given, with their variances and co- 
variances. For the most part the theory is designed for large populations. In 
setting up the models, consideration has been given to the desideratum of hav- 
ing them conform as closely as possible to the actual practices of animal sampling. 
To what extent the models do agree with reality is one of the many open ques- 
tions which are noted in this paper. 


1. Introduction. The use of sampling methods in the enumeration of popula- 
tions has become widely known and widely accepted only within the past genera- 
tion. Yet it is easily perceived that total enumeration methods fail for all but 
the simplest of populations. Particularly is this true of biological populations 
which may be mobile in space, transient in time and difficult of access. The 
changes in space (immigration and emigration) and in time (recruitment and 
mortality) must often be evaluated to determine the total population size and 
in any case these changes are usually of interest in their own right. 

In this survey, only those methods are considered for which it is possible to 
set up a reasonable statistical model and for which it is possible to assess the 
sampling errors. Attention is limited to methods that lead to absolute rather 
than relative estimates. Little work has been done to set up statistical models, 
as a basis of relative estimates, though for an important exception, attention is 
called to a paper of Neyman [22]. To give unity to this survey, only those methods 
that have been used in the study of macroscopic mobile populations are discussed. 

Fixed sample methods have been used for the most part in the enumeration of 
other biological populations. However, even the enumeration of sessile popula- 
tions can give rise to new statistical problems; many of these are noted in an 
important study of statistical problems in ecology, that recently has been initi- 
ated by Skellam [28]. A further reference in this field is to a paper by Hoel [13]. 


2. Tag-and-sample estimates: direct random sampling. When the population 
structure is undefined and unknown, it is not possible to select a fixed sample, 
as is the case say in ecology or in sampling human populations. The origin of the 
idea of using an associated variable of known distribution to build up a sample 
count into a total population is difficult to trace. Certainly Laplace [17] was 


Received 7/16/53. 

¹ Presented as a special invited address at the joint meeting of the Institute of Mathematical Statistics and the Biometric Society (WNAR) at Stanford, California, June 19, 1953.

² Work done under the sponsorship of the Office of Naval Research.



among the first to study the method carefully. He suggested determining the 
population of France from the known number of births in all parishes and from 
the fact that the ratio of births to total population. could be determined for some 
parishes. Petersen [23], a Danish biologist, first developed the procedure of 
marking fish to assist in studying their movements, migration, etc. He later came 
to realize that the marked fish could play the same role for his populations as 
the births did for Laplace—though evidently he was unaware of Laplace’s work. 

When a mathematical model is set up to formalize this intuitive approach, it
is usual to assume random sampling (i.e., sampling such that the properties
"being tagged" and "being sampled" are independent). It is much easier to make
this assumption than to verify it. It is also standard to assume that the numbers 
tagged and the numbers sampled are parameters at the disposal of the experi- 
menter. A completely adequate model must take into account the birth rate 
with possible lag effects, a changing death rate, as well as emigration and immigra- 
tion over the period during which repeated tagging and sampling take place. It 
is apparent that the number of unknown parameters is large and that such a 
model must be indeed complex. Some simplifying assumptions are desirable. 

The following model is not the most general possible; it does, however, cover 
many of the situations that have been studied and it leads to simple estimation 
procedures. It applies specifically to large populations and it is further assumed 
that either there are no new recruits to the population (through birth or immigra- 
tion) or that new recruits are distinguishable and may be eliminated from the 
samples. 


Model I

Unknown parameters:
  $N_0$ = total population size at time zero,
  $P$ = probability that an animal alive at time $t$ survives and remains in the population at time $t + 1$.

Known parameters:
  $t_i$ = number of animals tagged at the $i$th tagging, taking place at time $a_i$ $(i = 1, 2, 3, \ldots, m)$,
  $n_j$ = number of animals sampled in the $j$th sample taken at time $b_j$ $(j = 1, 2, 3, \ldots, r)$.

Random variables:
  $x_{ij}$ = number of animals originally tagged at the $i$th tagging and recovered in the $j$th sample,
  $\tau_{ij}$ = number of animals originally tagged at the $i$th tagging available for recovery at the time of the $j$th sample,
  $N_j$ = population size at the time of the $j$th sample.

$f(i)$ = the smallest value of $j$ such that animals tagged at the $i$th tagging have a positive probability of being recovered in the $j$th sample.

The event of survival is assumed to be independent from animal to animal.
For large $N_0$ it may be assumed that, given $\tau_{ij}$ and $N_j$ (which are not observable
r.v.), $x_{ij}$ has a conditional Poisson distribution with expectation $n_j \tau_{ij} N_j^{-1}$. It
then follows that, for large $N_0$, and either $t_i$ or $n_j$ large, $x_{ij}$ has approximately a
Poisson distribution with

$$\mathcal{E}(x_{ij}) = \frac{t_i n_j P^{-a_i}}{N_0}. \tag{1}$$

More precisely this holds as a limiting result as $N_0$ and the $t_i$ or $n_j \to \infty$ while
all $t_i n_j / N_0$ remain finite.
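For, defining

$$y_{ij} = \frac{N_j^{-1} \tau_{ij}}{t_i N_0^{-1}},$$

it is seen that

$$\mathcal{E}\left[e^{\sum_i \sum_j \theta_{ij} x_{ij}}\right] = \mathcal{E} \prod_{i=1}^{m} \prod_{j=f(i)}^{r} \exp\left\{\frac{t_i n_j}{N_0}\, y_{ij}\left(e^{\theta_{ij}} - 1\right)\right\}.$$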


Since, as $t_i \to \infty$ and $N_0 \to \infty$, $\operatorname{plim} N_j / N_0 = P^{b_j}$ and $\operatorname{plim} \tau_{ij} / t_i = P^{b_j - a_i}$, the
stated result immediately follows from a theorem of Mann and Wald ([31],
Theorem 5 and Corollary 2, pp. 223, 224).

Furthermore, since $t_i$ and $n_j$ enter symmetrically (the sampling may be regarded
as the "tagging" and reciprocally), it follows that the same limit distribution
holds when $N_0$, $n_j \to \infty$, with $t_i n_j / N_0$ remaining finite.

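With this approximation it is straightforward to set up the maximum likelihood
equations for $N_0$ and $P$, namely,

$$\hat N_0 = \frac{\sum_{i=1}^{m} \sum_{j=f(i)}^{r} t_i n_j P^{-a_i}}{x_{..}}, \tag{2}$$

$$x_{..} \left(\sum_{i=1}^{m} \sum_{j=f(i)}^{r} a_i t_i n_j P^{-a_i}\right) = \left(\sum_{i=1}^{m} \sum_{j=f(i)}^{r} a_i x_{ij}\right) \left(\sum_{i=1}^{m} \sum_{j=f(i)}^{r} t_i n_j P^{-a_i}\right), \tag{3}$$

where $x_{..} = \sum_{i=1}^{m} \sum_{j=f(i)}^{r} x_{ij}$. Equation (3) is a polynomial in $P$ that can be
solved by the usual methods.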
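
As an illustration of how equations (2) and (3) might be solved numerically, the following is a minimal sketch rather than the author's procedure. It assumes that the expected recoveries are $t_i n_j P^{-a_i}/N_0$, that $f(i)$ is the first sample taken strictly after the $i$th tagging ($b_j > a_i$), and that bisection on $P$ is an acceptable stand-in for "the usual methods". The function name `model1_mle` and its arguments are hypothetical.

```python
def model1_mle(t, a, n, b, x, tol=1e-12):
    """Solve equations (2) and (3) for (N0, P) by bisection on P.

    t[i], a[i]: number tagged and time of the i-th tagging
    n[j], b[j]: size and time of the j-th sample
    x[i][j]:    recoveries from tagging i in sample j
    Assumption: tagging i can be recovered in sample j only when b[j] > a[i].
    """
    pairs = [(i, j) for i in range(len(t)) for j in range(len(n)) if b[j] > a[i]]
    x_tot = sum(x[i][j] for i, j in pairs)          # x..
    ax_tot = sum(a[i] * x[i][j] for i, j in pairs)  # sum of a_i * x_ij

    def g(P):
        # Left side minus right side of equation (3).
        s0 = sum(t[i] * n[j] * P ** (-a[i]) for i, j in pairs)
        s1 = sum(a[i] * t[i] * n[j] * P ** (-a[i]) for i, j in pairs)
        return x_tot * s1 - ax_tot * s0

    lo, hi = 1e-6, 1.0
    if g(lo) * g(hi) > 0:
        raise ValueError("equation (3) has no root with P in (0, 1]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    P = 0.5 * (lo + hi)
    N0 = sum(t[i] * n[j] * P ** (-a[i]) for i, j in pairs) / x_tot  # equation (2)
    return N0, P
```

Feeding the function recoveries equal to their expectations (for a chosen $N_0$ and $P$) should return those same values, which is a convenient check of any implementation.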


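The inverse of the asymptotic variance-covariance matrix of $\hat N_0$ and $\hat P$ is

$$\begin{pmatrix}
N_0^{-3} \sum\sum t_i n_j P^{-a_i} & N_0^{-2} \sum\sum a_i t_i n_j P^{-(a_i + 1)} \\
N_0^{-2} \sum\sum a_i t_i n_j P^{-(a_i + 1)} & N_0^{-1} \sum\sum a_i^2 t_i n_j P^{-(a_i + 2)}
\end{pmatrix} \tag{4}$$

where each double sum runs over $i = 1, \ldots, m$ and $j = f(i), \ldots, r$.
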

It is convenient to display the parameters ($t_i$, $n_j$) and the observations ($x_{ij}$)
of such a census in a triangular array—the so-called “trellis diagram” used by 
Dowdeswell, Fisher and Ford [10], but much more thoroughly studied by Leslie 
and Chitty [19] and by Leslie [18]. Model I departs primarily from that proposed 
by Leslie and Chitty in ignoring multiple recaptures. Leslie and Chitty show this 
represents a loss of information; for large $N_0$, however, the expected number of
multiple recaptures is very small. In fact if this is not so, it suggests that the
stochastic variation of $\tau_{ij}$ and $N_j$ may no longer be negligible. Moreover the
multiple recaptures are often those most suspect from the point of view of ran- 
domness of the sample. 

Leslie and Chitty, in common with other investigators, assumed that mortality 
and emigration are strictly deterministic. Thus they are able to write down the 
expected values of the various classes of tag recoveries as polynomials in P, and 
to assume a multinomial distribution for these tag recoveries. The maximum 
likelihood equations can then be formulated, though the solution of the equations 
can, in general, be accomplished only by iterative methods. They have studied a
large number of problems in this manner and reference should be made to their 
papers for models appropriate to situations not considered here. A model based 
on the Poisson distribution can also be set up for most of these situations, which 
will be valid for large $N_0$, even though space and time variations are stochastic
variables, and which will often lead to simpler estimation equations. A complete 
treatment, considering this stochastic variation, has not been given for the case 
of small or moderate sized populations. 

The formulae given above easily specialize to Jackson's "negative" census
[14] (one in which several taggings are followed by a single sample, at which
time only are tag recoveries noted). Bailey [1] has given the maximum likelihood
estimates and their asymptotic variance-covariance matrices for Jackson's
various census schemes assuming deterministic birth and death rates. Jackson
also set up a "positive" census scheme, which he used to estimate the rate of
recruitment.

By defining a parameter $B$ as the probability that an individual alive at time
$t$ adds a new individual to the population by time $t + 1$, and assuming that this
event is independent of the event of survival, the model outlined above may be
extended and the restrictions of no recruitment may be removed. The $\{x_{ij}\}$
still have a Poisson distribution to the same approximation as before and the
maximum likelihood equations for $N_0$, $P$ and $B$ are easily written down. The
two equations involving $P$ and $B$ are polynomials jointly in $P$ and $B$. However,
it seems hardly realistic to assume that the recruitment rate is proportional to 
the population size or that it is independent of survival. Another approach is 
noted later. 

Another specialization of formulae (1) to (4) is to put P = 1 that is, assume 
mortality can be neglected. This situation is familiar to fishery biologists as a 
Schnabel type census, named for the person who published a mathematical 
theory of estimates based on such a multiple census [27]. More precisely, as 
noted by the author in [6], for large $N_0$,

$$\hat N_0 = \frac{\sum_{i=1}^{m} \sum_{j=f(i)}^{r} t_i n_j}{x_{..} + 1} \tag{6}$$



is approximately unbiased with standard deviation given by 





$$N_0 \left[\frac{N_0}{\sum_{i=1}^{m} \sum_{j=f(i)}^{r} t_i n_j} + O\!\left(\left(\frac{N_0}{\sum\sum t_i n_j}\right)^{2}\right)\right]^{1/2}. \tag{7}$$

Also confidence limits for the Poisson parameter will yield confidence limits for
$N_0$ in this case; see, for example, [4].
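
The Schnabel-type estimate can be sketched in a few lines. This is an illustration under stated assumptions, not the author's computation: the estimate is taken as $\sum\sum t_i n_j / (x_{..} + 1)$, a tagging contributes to a sample only when the sample is taken strictly later ($b_j > a_i$), and the standard deviation uses only the leading Poisson term $N_0 (N_0 / \sum\sum t_i n_j)^{1/2}$ with $N_0$ replaced by its estimate. The function name `schnabel_estimate` is hypothetical.

```python
import math


def schnabel_estimate(t, a, n, b, x_total):
    """Multiple-census estimate of N0 for a closed population (P = 1).

    t[i], a[i]: number tagged and time of the i-th tagging
    n[j], b[j]: size and time of the j-th sample
    x_total:    total number of tag recoveries over all samples
    """
    # Sum t_i * n_j over the (tagging, sample) pairs that can yield recoveries.
    s = sum(ti * nj for ti, ai in zip(t, a) for nj, bj in zip(n, b) if bj > ai)
    n_hat = s / (x_total + 1)
    # Leading-order approximation only; higher-order terms are ignored.
    sd = n_hat * math.sqrt(n_hat / s)
    return n_hat, sd
```

For example, two taggings of 100 animals at times 0 and 1, two samples of 200 at times 1 and 2, and 11 recoveries in all give $\sum\sum t_i n_j = 60000$ and an estimate of 5000.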

In the usual Schnabel census, tagging is carried on simultaneously with the 
sampling process. More precisely, after each sample is examined the untagged 
individuals are tagged and then all are returned to the population. If this is 
strictly followed, $t_i = n_i - \sum_{j < i} x_{ji}$ and hence the $t_i$ are random variables.
For large $N_0$, the random variation of the $t_i$ may be neglected. In fact it has
usually been disregarded in any case. 

It is apparent that there may have to be some restriction on $m$ and $r$ to make
the results given above meaningful. In particular if $m = r = 1$, no estimation
of the parameters $N_0$ and $P$ is possible, but estimation of $N_0$ is possible if $a_1 = 0$.
This is the simple Petersen situation: a single tagging followed by a single
sample. The formulae in this case are seen not to depend, for large $N_0$, on
mortality assumptions. The variance of $\hat N_0 = (n + 1)(t + 1)/(x + 1)$, the almost
unbiased estimate of $N_0$, is given by


2 _ a2| No (™*) | No 
(8) o¥, os No | + 0 caf — P . 


That this is a function of P, the survival factor, may be disregarded for most 
practical purposes. 
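
The single-tagging, single-sample estimate quoted above is simple enough to compute directly. The helper below is only an illustrative sketch; the name `petersen_estimate` is hypothetical and the function returns the point estimate alone.

```python
def petersen_estimate(t, n, x):
    """Almost unbiased Petersen-type estimate (n + 1)(t + 1) / (x + 1).

    t: number of animals tagged
    n: size of the single sample
    x: number of tagged animals found in the sample
    """
    return (n + 1) * (t + 1) / (x + 1)
```

With 200 animals tagged, a sample of 100 and 19 recoveries, the estimate is $101 \times 201 / 20 = 1015.05$.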


3. Tag-and-sample methods: inverse sampling. A modification of the sam- 
pling procedure outlined above has been developed by Bailey [1], Goodman [12] 
and the author [6]. If the number of tags to be recovered, rather than the sample 
size, is predetermined, estimates are obtained which are somewhat simpler and 
slightly more efficient. The most interesting of these results is that due to Good- 
man, who considered a multiple sample type of census for a situation where there 
is no recruitment and P = 1 (such a population will hereafter be referred to as 
closed). His procedure is sequential in that the decision to stop sampling is a 
consequence of the observations. 


Model II

Unknown parameter:
  $N_0$ = population size.

Known parameters:
  $\{n_i\}$ = the predetermined sequence of samples,
  $T_i$ = the number of tagged individuals in the population at the time the $i$th sample is taken ($T_i$, the cumulative total, is to be distinguished from $t_i$, the number of tags put out in the $i$th tagging) $(i = 1, 2, 3, \ldots)$,
  $x$ = the predetermined number of tagged members to be recovered before the sampling experiment stops.

Random variables:
  $r$ = the number of samples taken before the $x$ tagged individuals are recovered,
  $n = \sum_{i=1}^{r} n_i$.


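Sampling is assumed to be random with respect to tagged and untagged
individuals. Then for large $N_0$,

$$\begin{aligned}
&\Pr(r \text{ samples are required to obtain } x \text{ tagged members}) \\
&\quad = \sum_{j=1}^{x} \Pr[x - j \text{ tags are recovered in first } r - 1 \text{ samples}] \cdot \Pr[j \text{ tags are recovered in the } r\text{th sample}] \\
&\quad = \sum_{j=1}^{x} e^{-\sum_{i=1}^{r-1} \lambda_i} \frac{\left(\sum_{i=1}^{r-1} \lambda_i\right)^{x-j}}{(x - j)!} \cdot e^{-\lambda_r} \frac{\lambda_r^{j}}{j!} \\
&\quad = \frac{e^{-\sum_{i=1}^{r} \lambda_i}}{x!} \left[\left(\sum_{i=1}^{r} \lambda_i\right)^{x} - \left(\sum_{i=1}^{r-1} \lambda_i\right)^{x}\right]
\end{aligned} \tag{9}$$
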
where we have written $\lambda_i$ for $n_i T_i / N_0$. Making the change of variable

$$u = 2 \sum_{i=1}^{r} \lambda_i, \qquad \Delta u = 2 \lambda_r,$$

$$\Pr(a < u < b) = \sum_{a < u < b} e^{-u/2} \left[\frac{(u/2)^{x-1}}{(x - 1)!} \cdot \frac{\Delta u}{2} + o(\Delta u)\right]. \tag{10}$$

Let $N_0 \to \infty$ in such a way that $\lambda_i \to 0$ while $\sum_i \lambda_i > 0$. Using Duhamel's
lemma it can be shown that

$$\lim_{N_0 \to \infty} \Pr(a < u < b) = \frac{1}{2^{x} \Gamma(x)} \int_a^b e^{-u/2} u^{x-1} \, du, \tag{11}$$

that is, $u = 2 N_0^{-1} \sum_{i=1}^{r} n_i T_i$ has a limiting $\chi^2$ distribution with $2x$ degrees of
freedom. It follows that the (asymptotic) minimum variance unbiased estimate
of $N_0$ is

$$\hat N_0 = \frac{\sum_{i=1}^{r} n_i T_i}{x} \tag{12}$$

with $\sigma_{\hat N_0} = N_0 / \sqrt{x}$.


The proof given above differs from that of Goodman: he considered the Schna- 
bel type of census where tagging and sampling are performed in the same opera- 
tion, that is, all untagged individuals are tagged before the sample is returned to 
the population. What he showed, namely that $n^2 / N_0$ has, asymptotically, a $\chi^2$
distribution with $2x$ d.f., is equivalent to the above result. In this case it is
simple to find the average sample size, that is $\mathcal{E}(n)$ (for large $N_0$). For these
results and other exact sample results reference is made to Goodman’s paper 
cited, [12]. 
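
A short sketch of the inverse-sampling computation follows. It rests on the limiting result stated above: if $2 N_0^{-1} \sum n_i T_i$ is $\chi^2$ with $2x$ degrees of freedom (mean $2x$, variance $4x$), then $\sum n_i T_i / x$ is unbiased for $N_0$ with standard deviation $N_0 / \sqrt{x}$. The function name `inverse_sampling_estimate` is hypothetical, and the standard deviation is evaluated at the estimate rather than at the unknown $N_0$.

```python
import math


def inverse_sampling_estimate(n, T, x):
    """Inverse-sampling estimate of N0 for a closed population.

    n[i]: size of the i-th sample (samples taken until x tags are recovered)
    T[i]: number of tagged animals at large when the i-th sample is taken
    x:    predetermined number of tag recoveries
    """
    total = sum(ni * Ti for ni, Ti in zip(n, T))
    n_hat = total / x
    sd = n_hat / math.sqrt(x)  # plug-in value of N0 / sqrt(x)
    return n_hat, sd
```

For instance, three samples of 100 with 50, 100 and 150 tags at large and $x = 10$ give $\sum n_i T_i = 30000$ and an estimate of 3000.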





The simplicity of $\sigma_{\hat N_0}$ may make it particularly useful in designing the sample
census. Up to the moment, however, the several inverse sampling schemes
proposed have not been tried out. How to choose the sequence $\{n_i\}$ in an optimum
manner remains an open question. Nor has any attempt been made to set up a
theory of inverse sampling for other than closed populations.


4. Tag-and-sample estimates: regression approach. The assumptions under- 
lying Model I may fail for a variety of reasons—imperfect sampling, clustering 
of the populations, variation over the populations and over time, of the mortality 
(or emigration) rate, etc. In view of the considerable superimposed variability
that may thus exist, in addition to strictly multinomial (or Poisson) variation,
it is pertinent to ask whether a linear regression model might not be more
appropriate.


Model III. The same notation as Model I is needed. However, the restriction
that there be no recruitment may be removed. Hence it is more reasonable to
regard $N_1, N_2, \ldots, N_r$ as unknown parameters to be estimated. Furthermore
the definition of $P$ can be extended as follows: $P$ = the average probability
that an individual alive at time $t$ survives and remains within the population to
time $t + 1$.

If the sampling is such that 


(13) 
it follows that 


(14) 


(15) 


and that $\ln(x_{ij} + 1)$ has a constant variance (approximately).
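
The displayed forms of equations (13) to (15) can be sketched by analogy with equations (16) and (17) below and with the notation of Model I. The block that follows is an assumed reading, offered only to make the regression structure concrete; it is not a verified transcription of the original formulas.

```latex
% Assumed forms, by analogy with (16) and (17); N_j is now a free parameter.
\mathcal{E}(x_{ij} \mid \tau_{ij}) = \frac{n_j \tau_{ij}}{N_j},              % cf. (13)
\qquad
\mathcal{E}(x_{ij}) = \frac{t_i n_j P^{\,b_j - a_i}}{N_j},                   % cf. (14)
\qquad
\mathcal{E}\!\left[\ln \frac{t_i n_j}{x_{ij} + 1}\right]
  \approx \ln N_j - (b_j - a_i) \ln P .                                      % cf. (15)
```

Under this reading the left side of the last relation is observable, and the right side is linear in the unknowns $\ln N_j$ and $\ln P$, which is what makes a least squares treatment possible.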

The factor $(x_{ij} + 1)$ is suggested by the fact that the reciprocal of a binomial
or Poisson r.v. plus one is an (almost) unbiased estimate of the reciprocal of the
parameter. Moreover such a device avoids the difficulties of occasional zeros;
care must be exercised if the zeros are numerous or in sequence, for the
assumption above may then be clearly invalid. The logarithmic transformation is
suggested by the product nature of equation (14). However, it is also true that the
variance of the logarithm of a variable that is distributed according to the
Poisson law is constant up to terms of order $\lambda^{-1}$. Furthermore, the logarithmic
transformation has been extensively used in analysing data obtained from pelagic
hauls or catches (cf. e.g. Winsor and Clark [32]).

Best linear unbiased estimates of $\ln P$ and $\ln N_j$ are found by the least squares
method (under these assumptions). From these, estimates of $P$ and $N_j$ are
obtainable which have optimum asymptotic properties though not necessarily
optimum small sample properties. Interval estimates may also be obtained by
postulating approximate normality of the $\ln(x_{ij} + 1)$. Such interval estimates
may be much more realistic than those based on Model I, if there is in fact 
superimposed variability due to the causes indicated or to other causes. 

Model III represents in a sense an omnibus model. It has the advantage that 
an estimate of the extraneous variation can be made from the observations. On 
the other hand, it is imprecise and heuristic rather than rigorous. If the hetero- 
geneities noted can be carefully assayed, if not controlled, it may be possible to 
set up a model which has this advantage and is at the same time more exact. 

This type of approach would give some flexibility to the assumptions underlying
Jackson's positive census (where a single tagging is followed by a sequence of
samples) or more generally to the "trellis diagram" census scheme where
recruitment is to be taken into account by a single parameter. Redefining $B$ as
the average probability that an individual within the population at time $t$ adds
a new recruit to the population at time $t + 1$, similar assumptions as those
above lead to

$$\mathcal{E}(x_{ij}) = \frac{t_i n_j P^{\,b_j - a_i}}{N_0 (P + B)^{b_j}}. \tag{16}$$

Hence estimates of $P$, $B$ and $N_0$ could be derived from the least squares
estimates of $\ln P$, $\ln(P + B)$ and $\ln N_0$ from the equation:

$$\mathcal{E}\left(\ln \frac{t_i n_j}{x_{ij}}\right) = (a_i - b_j) \ln P + \ln N_0 + b_j \ln(P + B). \tag{17}$$


5. Dichotomy methods. A method of estimating population size that has been 
used in wildlife research and which may be useful in other fields, is based on the 
change of sex ratio caused by a selective kill. The sex ratio is determined before 
and after the kill by sampling methods. Several references to field applications 
of the method are listed by Scattergood [25] in a general survey of methods of 
population estimation. 

The estimation procedure may be based upon any dichotomy within the popu- 
lation, or even on external factors: all that is required is a sampling process 
followed by a selective removal of individuals from the population, and subse- 
quently a further sampling process. Closed populations only will be considered. 


Model IV

Unknown parameters:
  $N_i$ = population at time $t_i$ $(i = 1, 2)$ made up of two classes $X$ and $Y$,
  $X_i$, $Y_i$ = size of classes $X$, $Y$ at times $t_i$.

Known parameters:
  $n_i$ = size of random samples taken at time $t_i$,
  $r_x = X_1 - X_2$; $r_y = Y_1 - Y_2$; $r = r_x + r_y$.

Random variables:
  $x_i$ = number of elements of class $X$ in sample $n_i$,
  $y_i$ = number of elements of class $Y$ in sample $n_i$ $(i = 1, 2)$.


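Assuming sampling with replacement,

$$p(x_1, x_2) = \binom{n_1}{x_1} \left(\frac{X_1}{N_1}\right)^{x_1} \left(1 - \frac{X_1}{N_1}\right)^{n_1 - x_1} \binom{n_2}{x_2} \left(\frac{X_2}{N_2}\right)^{x_2} \left(1 - \frac{X_2}{N_2}\right)^{n_2 - x_2}. \tag{18}$$

Since it is assumed that $X_2 = X_1 - r_x$ and $N_2 = N_1 - r$ are expressible in terms of $X_1$ and $N_1$ and known
parameters, estimates of $X_1$ and $N_1$ are easily found. The moment estimates

$$\hat X_1 = \frac{x_1 (n_2 r_x - x_2 r)}{n_2 x_1 - n_1 x_2} \tag{19}$$

$$\hat N_1 = \frac{n_1 (n_2 r_x - x_2 r)}{n_2 x_1 - n_1 x_2} \tag{20}$$

are also maximum likelihood estimates.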
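
The moment estimates can be checked with a small routine. This is a sketch under the assumption that the estimates solve $x_1/n_1 = X_1/N_1$ and $x_2/n_2 = (X_1 - r_x)/(N_1 - r)$, where $r_x$ and $r_y$ are the known removals from the two classes and $r = r_x + r_y$. The function name `dichotomy_estimates` is hypothetical.

```python
def dichotomy_estimates(n1, x1, n2, x2, r_x, r_y):
    """Moment (and maximum likelihood) estimates of X1 and N1.

    n1, x1: size of the sample before removal and its count of class X
    n2, x2: size of the sample after removal and its count of class X
    r_x, r_y: known numbers removed from classes X and Y between samples
    """
    r = r_x + r_y
    denom = n2 * x1 - n1 * x2
    if denom == 0:
        # The class proportion did not change, so the removal carries no information.
        raise ValueError("sample proportions are equal; estimates undefined")
    common = (n2 * r_x - x2 * r) / denom
    return x1 * common, n1 * common
```

With $X_1 = 600$, $N_1 = 1000$ and 300 animals removed from class $X$ only, samples that reproduce the true proportions exactly ($x_1/n_1 = 0.6$, $x_2/n_2 = 300/700$) return the true values.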


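Formulae for the inverse of the asymptotic variance-covariance matrix are
as follows:

$$\begin{pmatrix}
\dfrac{n_1}{X_1 Y_1} + \dfrac{n_2}{X_2 Y_2} & -\dfrac{n_1}{N_1 Y_1} - \dfrac{n_2}{N_2 Y_2} \\[2ex]
-\dfrac{n_1}{N_1 Y_1} - \dfrac{n_2}{N_2 Y_2} & \dfrac{n_1 X_1}{N_1^2 Y_1} + \dfrac{n_2 X_2}{N_2^2 Y_2}
\end{pmatrix} \tag{21}$$

so that

$$\sigma^2_{\hat X_1} \ (\text{asymptotic}) = \frac{\dfrac{p_1^2 X_2 Y_2}{n_2} + \dfrac{p_2^2 X_1 Y_1}{n_1}}{(p_1 - p_2)^2} \tag{22}$$

where $p_i = X_i / N_i$ $(i = 1, 2)$.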
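
The step from the information matrix to the asymptotic variance can be written out as follows. This is a sketch that assumes the binomial model (18) with $X_2 = X_1 - r_x$ and $N_2 = N_1 - r$; the shorthand $w_i = n_i / (X_i Y_i)$ is introduced here only for the derivation.

```latex
% With w_i = n_i / (X_i Y_i) and p_i = X_i / N_i, the information matrix
% for (X_1, N_1) can be written
I = \begin{pmatrix}
      w_1 + w_2 & -(w_1 p_1 + w_2 p_2) \\
      -(w_1 p_1 + w_2 p_2) & w_1 p_1^2 + w_2 p_2^2
    \end{pmatrix},
\qquad
\det I = w_1 w_2 (p_1 - p_2)^2 .

% Inverting the 2 x 2 matrix gives
\sigma^2_{\hat X_1}
  = \frac{w_1 p_1^2 + w_2 p_2^2}{w_1 w_2 (p_1 - p_2)^2}
  = \frac{p_1^2 X_2 Y_2 / n_2 + p_2^2 X_1 Y_1 / n_1}{(p_1 - p_2)^2},
\qquad
\sigma^2_{\hat N_1}
  = \frac{w_1 + w_2}{w_1 w_2 (p_1 - p_2)^2} .
```

Both variances grow without bound as $p_1 \to p_2$, which reflects the fact that a removal leaving the class proportions unchanged gives no information about population size.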


These formulae may be used to determine the optimum theoretical allocation 
of sampling between the before and after samples. It is also interesting to use 
them to compare the effort required for this type of census with that required for 
tag sample methods. A numerical study shows that the tag sample method has 
the advantage—assuming that the tags are sampled by the removal process. 
However, the evaluation is incomplete without some means of determining the 
relative costs of sampling and tagging. Moreover, it is reasonable to suppose 
that the assumptions underlying the dichotomy method are more likely to be 
fulfilled than in the tag-sample method—questions of tag mortality and differ- 
ential recapture rates do not arise in the former process. 

In some situations it may be possible to sample two populations, for example, 
a sport fish and a scrap fish. The sports fishery then serves as the selective 
removal factor in a very favorable situation since $r_y$ will be zero. In this case $X$
is the parameter that it is of interest to estimate. 







The method may also be applied where the removal is done by the sampler. 
In this case it is more realistic to assume that a succession of samples are taken. 
Again it is straightforward to set up the model for this situation and to derive 
the maximum likelihood equations for $X$ and $N$. This naturally suggests a
sequential estimation procedure where the decision to stop is determined by the 
sample results. 

If there is dilution or elimination, the procedure is obviously vitiated. As yet 
no work has been done to extend the method to estimate these factors. Esti- 
mates of mortality for example might be based on a trichotomy or on an inter- 
mediate sampling during the removal process. The several sample scheme (se- 
quential or not) would lend itself to this more complicated situation. 


6. Methods based on the notion of effort. That the amount of effort expended 
in obtaining a given sample of a population is proportional to the population 
density has long been the basis of relative population estimates. Leslie and 
Davis [20] and independently DeLury [7] showed how absolute estimates could 
be determined from this information, when the successive samples are removed 
from the population—as for example occurs in the catch of a fishery. Except for 
this catch, the population is assumed closed. A model similar to DeLury’s is 
as follows: 


Model V

Unknown parameters:
  $N_0$ = initial population size,
  $k$ = average probability that an individual is captured by one unit of effort in any time interval.

Known parameter:
  $K_t$ = total catch up to but not including the $t$th interval.

Random variable:
  $C_t$ = catch per unit of effort during the $t$th interval $(t = 1, 2, \ldots, m)$.


If the units of effort are independent it follows that $\mathcal{E}(C_t) = k(N_0 - K_t)$.
With the further assumption that $\sigma_{C_t}$ is approximately constant (which is
reasonable for large $N_0$ unless the cumulative catch represents a large segment of the
population by the end of the experiment), least squares estimates of $k$ and $N_0$
may be found. In particular

$$\hat N_0 = \bar K - \frac{\bar C \sum_{t=1}^{m} (K_t - \bar K)^2}{\sum_{t=1}^{m} C_t (K_t - \bar K)}. \tag{23}$$
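
The least squares computation can be sketched as an ordinary regression of catch per unit of effort on cumulative catch. The sketch assumes the linear relation $\mathcal{E}(C_t) = k(N_0 - K_t)$, so that the slope estimates $-k$ and the intercept estimates $k N_0$. The function name `leslie_delury` is hypothetical.

```python
def leslie_delury(K, C):
    """Least squares estimates of (N0, k) from E(C_t) = k * (N0 - K_t).

    K[t]: cumulative catch up to but not including interval t
    C[t]: catch per unit of effort during interval t
    """
    m = len(K)
    K_bar = sum(K) / m
    C_bar = sum(C) / m
    s_kk = sum((k - K_bar) ** 2 for k in K)
    s_ck = sum(c * (k - K_bar) for c, k in zip(C, K))
    slope = s_ck / s_kk                    # estimates -k
    k_hat = -slope
    n0_hat = K_bar - C_bar * s_kk / s_ck   # K_bar - C_bar / slope
    return n0_hat, k_hat
```

Data lying exactly on the line for $N_0 = 1000$ and $k = 0.01$, such as cumulative catches 0, 100, 190, 271 with catch rates 10, 9, 8.1, 7.29, return those parameter values.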


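If the further assumption of approximate normality of the $C_t$ is made,
confidence intervals for $N_0$ are

$$N_1 = \bar K - \gamma_2 < N_0 < \bar K - \gamma_1 = N_2 \tag{24}$$

where $\gamma_1$, $\gamma_2$ are the roots of the equation

$$\gamma^2 \left[\left(\sum_{t=1}^{m} C_t (K_t - \bar K)\right)^2 - q^2 \sum_{t=1}^{m} (K_t - \bar K)^2\right] - 2 \gamma \bar C \sum_{t=1}^{m} (K_t - \bar K)^2 \sum_{t=1}^{m} C_t (K_t - \bar K) + \left(\bar C^2 - \frac{q^2}{m}\right) \left(\sum_{t=1}^{m} (K_t - \bar K)^2\right)^2 = 0 \tag{25}$$

where $q = t_{1 - \alpha/2}(m - 2)\, s$, and $s^2$ is the residual variance of the $C_t$ about the
regression line.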
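
A Fieller-type interval of this kind can be sketched as follows. The code assumes that $\gamma = \bar C / b$, with $b$ the fitted slope, satisfies $(\bar C - \gamma b)^2 \le q^2 (1/m + \gamma^2 / \sum (K_t - \bar K)^2)$ and that the interval for $N_0$ is $(\bar K - \gamma_2, \bar K - \gamma_1)$. The Student quantile is supplied by the caller to keep the example free of external libraries; the function name `fieller_interval` is hypothetical.

```python
import math


def fieller_interval(K, C, t_quantile):
    """Closed Fieller-type interval for N0, or None when no closed interval exists.

    K[t]: cumulative catch before interval t
    C[t]: catch per unit of effort in interval t
    t_quantile: Student t quantile t_{1 - alpha/2}(m - 2), supplied by the caller
    """
    m = len(K)
    K_bar = sum(K) / m
    C_bar = sum(C) / m
    s_kk = sum((k - K_bar) ** 2 for k in K)
    s_ck = sum(c * (k - K_bar) for c, k in zip(C, K))
    slope = s_ck / s_kk
    resid = [c - C_bar - slope * (k - K_bar) for c, k in zip(C, K)]
    s2 = sum(e * e for e in resid) / (m - 2)   # residual variance
    q2 = t_quantile ** 2 * s2

    # Quadratic in gamma: a2 * g^2 + a1 * g + a0 = 0.
    a2 = s_ck ** 2 - q2 * s_kk
    a1 = -2.0 * C_bar * s_kk * s_ck
    a0 = (C_bar ** 2 - q2 / m) * s_kk ** 2
    disc = a1 * a1 - 4.0 * a2 * a0
    if a2 <= 0 or disc < 0:
        return None   # slope not significantly different from zero
    g1 = (-a1 - math.sqrt(disc)) / (2.0 * a2)  # smaller root
    g2 = (-a1 + math.sqrt(disc)) / (2.0 * a2)  # larger root
    return K_bar - g2, K_bar - g1
```

When the leading coefficient is positive the point estimate $\bar K - \bar C / b$ always lies inside the returned interval, which gives a simple consistency check.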

The confidence intervals are obtained by the Fieller technique [29]. From a
general study of this method by Junge [30], it may be inferred that when the
coefficient of $\gamma^2$ is positive, the equation has only real distinct roots. Letting $\gamma_1$
be the smaller of the two roots, a closed confidence interval is obtained for $N_0$.
If $\gamma_2 > \bar K - K_{m+1}$, the lower confidence limit yields less information than the
fact that the initial population $N_0$ must exceed $K_{m+1}$, the total catch.

If the coefficient of $\gamma^2$ is negative, the roots of equation (25) may be real or
imaginary. In the former case the confidence interval for $N_0$ is of the form
$(K_{m+1}, N_1) \cup (N_2, \infty)$; in the latter case the trivial confidence interval $(K_{m+1}, \infty)$
is obtained. In general the probability of either of these situations occurring is
extremely small. There is also a small, though still nonzero probability that in
the case where the coefficient of $\gamma^2$ is positive, both $\gamma_1$ and $\gamma_2$ exceed $\bar K - K_{m+1}$,
so that the confidence interval for $N_0$ is degenerate. In practice, however, the
occurrence of these cases will suggest a careful re-examination of the situation
to determine whether Model V is indeed the appropriate model.

DeLury has also considered the possibility of weighting the least squares 
estimates, though he suggests that such a procedure may be meaningless if the 
sampling is not random. This is very likely the case in utilizing commercial or 
sports catch records or in sampling schooling populations for example. For a 
further discussion of these points and of the method in general, reference is 
made to [7] and [8].

For the case where the effort is constant, Moran [21] has set down a model 
based on the assumption of random sampling. The model may easily be extended 
to the case where the effort varies from period to period. A somewhat more 
interesting extension is based on a combination of tag and sample and catch 
per unit of effort methods. The case of a closed population is still considered. 


Model VI

Unknown parameters:
  $N_0$ = initial population size,
  $k$ = probability that a unit of effort captures one member of the population.

Known parameters:
  $K_i$ = cumulative number removed from population up to but not including the $i$th sample,
  $e_i$ = number of units of effort expended on the $i$th sample,
  $t_i$ = the number of tagged individuals remaining in the population at the time the $i$th sample is taken.

Random variables:
  $n_i$ = size of the $i$th sample or catch, which is then removed from the population,
  $x_i$ = number of tagged individuals in the $i$th sample $(i = 1, 2, \ldots, r)$.



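It is assumed that $n_i$ has a Poisson distribution and that given $n_i$, $x_i$ also has
a Poisson distribution. With the usual proviso that the units of effort are
independent,

$$\mathcal{E}(n_i) = k (N_0 - K_i) e_i \tag{26}$$

and

$$\mathcal{E}(x_i \mid n_i) = \frac{n_i t_i}{N_0 - K_i}. \tag{27}$$

Hence

$$\Pr(n_1, n_2, \ldots, n_r; x_1, x_2, \ldots, x_r) = \prod_{i=1}^{r} \frac{e^{-k e_i (N_0 - K_i)} \left[k e_i (N_0 - K_i)\right]^{n_i}}{n_i!} \cdot \frac{e^{-n_i t_i / (N_0 - K_i)}}{x_i!} \left[\frac{n_i t_i}{N_0 - K_i}\right]^{x_i}. \tag{28}$$
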

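The maximum likelihood equations for $k$ and $N_0$ are

$$\hat k = \frac{\sum_{i=1}^{r} n_i}{\sum_{i=1}^{r} e_i (\hat N - K_i)} \tag{29}$$

$$\sum_{i=1}^{r} \frac{n_i - x_i}{\hat N - K_i} + \sum_{i=1}^{r} \frac{n_i t_i}{(\hat N - K_i)^2} = \frac{\sum_{i=1}^{r} n_i \sum_{i=1}^{r} e_i}{\sum_{i=1}^{r} e_i (\hat N - K_i)}. \tag{30}$$
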
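
The likelihood equations can be traced back to the log-likelihood of the two Poisson assumptions. The derivation below is a sketch under those assumptions only ($n_i$ Poisson with mean $k e_i (N_0 - K_i)$, and $x_i$ given $n_i$ Poisson with mean $n_i t_i / (N_0 - K_i)$); terms not involving $k$ or $N_0$ are dropped.

```latex
\ell(k, N_0) = \sum_{i=1}^{r} \Big[ -k e_i (N_0 - K_i) + n_i \ln k
      + (n_i - x_i) \ln (N_0 - K_i) - \frac{n_i t_i}{N_0 - K_i} \Big] + \text{const},

\frac{\partial \ell}{\partial k}
   = -\sum_{i} e_i (N_0 - K_i) + \frac{1}{k} \sum_{i} n_i = 0
   \;\Longrightarrow\;
   \hat k = \frac{\sum_i n_i}{\sum_i e_i (\hat N - K_i)},

\frac{\partial \ell}{\partial N_0}
   = -k \sum_{i} e_i + \sum_{i} \frac{n_i - x_i}{N_0 - K_i}
     + \sum_{i} \frac{n_i t_i}{(N_0 - K_i)^2} = 0 .
```

Substituting $\hat k$ into the second equation leaves a single equation in $\hat N$, which can be solved numerically.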
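The inverse of the variance-covariance matrix of $\hat k$ and $\hat N_0$, expressed in terms
of the $K_i$, is:

$$\begin{pmatrix}
\dfrac{1}{k} \sum_{i=1}^{r} e_i (N_0 - K_i) & \sum_{i=1}^{r} e_i \\[2ex]
\sum_{i=1}^{r} e_i & k \sum_{i=1}^{r} \dfrac{e_i}{N_0 - K_i} \left[1 + \dfrac{t_i}{N_0 - K_i}\right]
\end{pmatrix} \tag{31}$$
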

7. Further problems. Each of the models set up and others that have been 
considered involves one or more assumptions which it is difficult or impossible 
to verify directly. For example, underlying the tag-and-sample models there is 
the assumption that tagged members of the population behave similarly to the 
untagged members, at least in respect to recapture. A primary assumption of 
the methods based on effort is that catchability is constant. 

Some empirical studies have been made to verify the estimates of populations 
by sampling methods. In some experiments conducted on fresh-water lakes the 







whole population has been poisoned out (a procedure that can hardly be recom- 
mended as an enumeration procedure except where the elimination of the existent 
populations has been the primary aim). The agreement has been satisfactory 
for some species but not for all; for example cf. Carlander [3]. It should be
remarked that sampling methods have often been necessary in connection with
the estimates determined from the dead recoveries. 

Such methods of verification have at best limited application. It is necessary 
to design sampling experiments specifically for this purpose. In this connection
it is suggested that combinations of the various methods outlined may be useful.
This has been proposed by DeLury [9]; his discussion of the underlying
assumptions of sample census methods is particularly pertinent.

Such combinations, of which Model VI is an example, may also yield more 
information than the application of a single method. Of course if the sampling 
is being done by a succession of commercial catching, Model VI is the appropriate 
one rather than Model I—though the heterogeneities introduced by such com- 
mercial catch sampling may suggest a regression model, that is, an extension of 
Model III.

In Model I the numbers tagged and sampled were regarded as parameters; 
in actual fact they may also be random variables. For example the sample may 
be a commercial catch which includes elements from populations other than the 
one to be estimated. Subsamples are taken from the commercial catch in order
to estimate the number from the population under study, that is, $n$. The use of
$\hat n$ in place of $n$ is suggested. This complicates the interval estimation problem
and while a crude determination of a confidence interval for $N_0$ is possible by a
step procedure (a confidence interval for $n$ is first obtained), this patently wastes
information. The several variations of this situation that may arise suggest the
necessity of a study of confidence intervals in connection with compound
distributions.


Referring again to Model I, it may be recognized from the outset that
heterogeneity exists within the sampling procedure. If it is possible to subdivide the
tagging and sampling into periods (by time or area, for example), within each
of which random sampling may be assumed, then it is possible to obtain
consistent estimates, though the interval estimation problem is unsolved. This
situation was first considered by Schaefer [26].

An obvious extension of Model V is to assume that the probability of capture, 
rather than being constant over the population, is itself a random variable. The 
distribution of the probability of capture may perhaps be related to the ex- 
pected catch in any time interval. Additional information is available if different 
methods of capture are being used simultaneously—in fact in this case the 
restriction that the population is closed may be relaxed and an estimation pro- 
cedure set up for the population size at each time interval. 

As has been implied, the interval estimation problem remains unsolved for
many of the models, except for the large sample results. Correspondingly, the
sample theory of tests in connection with such models has been given almost no
attention. Some simple applications of the $\chi^2$ test have been given by Leslie
[18] and by the author [5]. As more intricate experiments are designed and more
careful control plans undertaken it will be necessary to consider tests for
recruitment and mortality rates, for example.


The complexities of estimating the birth, death, emigration and immigration
rates indicate that it will be necessary to set up special experiments to determine
these factors adequately. Some of the experiments set up by Jackson [15], where
marking and recovery were carried on in a series of adjacent areas, were designed
for this purpose. Random walk theory has been applied in one special situation
by Gilmour, Waterhouse and McIntyre [11]. The study of birth and death
processes, and of processes associated with random as well as migratory movement,
is necessarily associated with the population estimation problem and the latter
will be completely solved only when the problems associated with these stochastic
processes are resolved.


REFERENCES
POPULATION ESTIMATION

[1] Norman T. J. Bailey, "On estimating the size of mobile populations from recapture
      data," Biometrika, Vol. 38 (1951), pp. 293-306.
[2] Norman T. J. Bailey, "Improvements in the interpretation of recapture data," J.
      Animal Ecology, Vol. 21 (1952), pp. 120-127.
[3] Kenneth D. Carlander and William M. Lewis, "Some precautions in estimating
      fish populations," Prog. Fish. Cult., Vol. 10 (1948), pp. 134-137.
[4] Douglas G. Chapman, "A mathematical study of confidence limits of salmon
      populations calculated from sample tag ratios," Internat. Pacific Salmon Fisheries
      Comm. Bulletin No. 2 (1948), pp. 69-85.
[5] Douglas G. Chapman, "Some properties of the hypergeometric distribution with
      applications to zoological sample censuses," Univ. California Publ. Stat., Vol. 1
      (1951), pp. 131-160.
[6] Douglas G. Chapman, "Inverse, multiple and sequential sample censuses,"
      Biometrics, Vol. 8 (1952), pp. 286-306.
[7] D. B. DeLury, "On the estimation of biological populations," Biometrics, Vol. 3
      (1947), pp. 145-167.
[8] D. B. DeLury, "On the planning of experiments for the estimation of fish
      populations," Jour. Fish. Res. Bd. Can., Vol. 8 (1951), pp. 281-307.
[9] D. B. DeLury, "On the assumptions underlying estimates of mobile populations,"
      Biostatistics Conference, Ames, Iowa, 1952 (to be published).
[10] W. H. Dowdeswell, R. A. Fisher and E. B. Ford, "The quantitative study of
      population in the Lepidoptera. I. Polyommatus Icarus Rott.," Ann. Eugen., Vol. 10
      (1940), pp. 123-136.
[11] D. Gilmour, D. F. Waterhouse and G. A. McIntyre, "An account of experiments
      undertaken to determine the natural population density of the sheep blowfly
      Lucilia cuprina Wied.," Bull. Aust. Council Sci. Ind. Res., No. 195.
[12] Leo A. Goodman, "Sequential sampling tagging for population size problems," Ann.
      Math. Stat., Vol. 24 (1953), pp. 56-69.
[13] Paul G. Hoel, "The accuracy of sampling methods in ecology," Ann. Math. Stat.,
      Vol. 14 (1943), pp. 289-300.
[14] C. H. N. Jackson, "Some new methods in the study of Glossina morsitans," Proc.
      Zool. Soc. London, Vol. 4 (1936), pp. 811-896.
[15] C. H. N. Jackson, "The analysis of an animal population," J. Animal Ecology, Vol.
      8 (1939), pp. 238-46.
[16] C. H. N. Jackson, "The analysis of a tsetse-fly population," Ann. Eugen., Vol. 10 (1940),
      pp. 332-369.
[17] P. S. Laplace, "Sur les naissances, les mariages et les morts," Histoire de l'Académie
      Royale des Sciences, Année 1783, Paris, p. 693 (actually published in 1786).
[18] P. H. Leslie, "The estimation of population parameters from data obtained by means
      of the capture-recapture method, II. The estimation of total numbers,"
      Biometrika, Vol. 39 (1952), pp. 363-388.
[19] P. H. Leslie and Dennis Chitty, "The estimation of population parameters from
      data obtained by means of the capture-recapture method, I. The maximum
      likelihood equations for estimating the death-rate," Biometrika, Vol. 38 (1951),
      pp. 269-292.
[20] P. H. Leslie and D. H. S. Davis, "An attempt to determine the absolute number of
      rats on a given area," J. Animal Ecology, Vol. 8 (1939), pp. 94-113.
[21] P. A. P. Moran, "A mathematical theory of animal trapping," Biometrika, Vol. 38
      (1951), pp. 307-311.
[21a] P. A. P. Moran, "The estimation of death rates from capture-mark-recapture
      sampling," Biometrika, Vol. 39 (1952), pp. 181-188.
[22] J. Neyman, "On the problem of estimating the number of schools of fish," Univ.
      California Publ. Stat., Vol. 1 (1949), pp. 21-36.
[23] C. G. J. Petersen, "The yearly immigration of young plaice into the Limfjord from
      the German Sea, etc.," Rept. Danish Biol. Sta. for 1895, Vol. 6 (1896), pp. 1-48
      (cf. also later reports).
[24] William E. Ricker, "Methods of estimating vital statistics of fish populations,"
      Indiana Univ. Publ. Science Series No. 15, 1948.
[25] Leslie W. Scattergood, "Estimating fish and wildlife populations: a survey
      of methods," Biostatistics Conference, Ames, Iowa, 1952 (to be published).
[26] Milner B. Schaefer, "Estimation of the size of animal populations by marking
      experiments," U. S. Fish and Wildlife Ser. Fish. Bull., Vol. 69 (1951), pp. 191-203.
[27] Z. E. Schnabel, "Estimation of the total fish population of a lake," Amer. Math.
      Monthly, Vol. 45 (1938), pp. 348-352.
[28] J. G. Skellam, "Studies in statistical ecology. I. Spatial pattern," Biometrika, Vol. 39
      (1952), pp. 346-362.



OTHER REFERENCES

[29] E. C. Fieller, "The biological standardization of insulin," J. Roy. Stat. Soc. Suppl.,
      Vol. 7 (1940), pp. 1-65.
[30] Charles O. Junge, Jr., University of Washington, 1953, unpublished manuscript.
[31] H. B. Mann and A. Wald, "On stochastic limit and order relationships," Ann. Math.
      Stat., Vol. 14 (1943), pp. 217-226.
[32] C. P. Winsor and G. L. Clarke, "A statistical study of variation in the catch of
      plankton nets," J. Marine Research, Vol. 3 (1940), pp. 1-34.





A SINGLE-SAMPLE MULTIPLE DECISION PROCEDURE FOR
RANKING MEANS OF NORMAL POPULATIONS WITH
KNOWN VARIANCES¹


By Robert E. Bechhofer
Cornell University²


Summary. This paper is concerned with a single-sample multiple decision 
procedure for ranking means of normal populations with known variances. 
Problems which conventionally are handled by the analysis of variance (Model 
I) which tests the hypothesis that k means are equal are reformulated as multiple 
decision procedures involving rankings. It is shown how to design experiments so 
that useful statements can be made concerning these rankings on the basis of a 
predetermined number of independent observations taken from each population. 
The number of observations required is determined by the desired probability 
of a correct ranking when certain differences between population means are 
specified. 


1. Introduction. In many of the experimental situations to which tests of
homogeneity conventionally are applied, such as the F-test that k population 
means are equal, or Bartlett’s test that k population variances are equal, the 
tests (whether or not they yield statistically significant results) do not supply 
the information that the experimenter seeks. Thus in an agricultural problem the 
hypothesis that several essentially different varieties of grain have the same 
(population) mean yield is an unrealistic one since it is obvious that if the varie- 
ties actually are different, the (population) mean yields also will be different, 
and a sufficiently large sample will establish this fact at any preassigned level 
of significance. Moreover, should a significant result be obtained, the experi- 
menter’s problems usually have just begun. For having established that the 
varieties are different he may now desire to select the one which is ‘‘best.’’ Here 
the best variety might be defined as the one having the largest (population) mean 
yield. Whenever the experimenter ultimately is faced by the prospect of having 
to choose a best variety, it seems reasonable that the experiment should have been 
designed with this outcome in mind. What is needed then is a decision procedure
which will tell the experimenter which population or populations to choose, and 
an operating characteristic which will tell him the probability of his making a 
correct choice if he follows the given decision procedure. The experiment then 
should be so designed as to control (in some sense) this probability at some pre- 
assigned level. 


Although the formulation of the problem as outlined above appears to be a
reasonable one, little work along these lines has appeared in the literature. In
this connection three papers by Paulson [12], [13], [14], all involving multiple
decision procedures, deserve special mention. In the first he considers the $2^k - 1$
decision problem of dividing a set of $k$ population means into a “superior”
and an “inferior” group; in the second he considers the $k$ decision problem of
determining the “best” of $k$ populations when comparing $k - 1$ experimental
populations with a control population; in the third he finds an optimum
solution to a $k + 1$ decision “slippage” problem of Mosteller [10]. Duncan [5],
[6] has considered multiple decision procedures involving means. (It is not clear
what kind of over-all confidence statement (with stated confidence coefficient)
the experimenter can make if he uses Duncan's procedure.) Tukey [19] and
Scheffé [16] have proposed very interesting alternate formulations of the analysis
of variance problem; they are concerned with making multiple comparisons
among the means.

Received 2/6/52, revised 8/7/53.

1 Based on research sponsored by the Office of Naval Research.

2 The author initiated research on this project while he was at Columbia University.

The principal results of the present paper deal with a single-sample method 
of designing experiments to determine the ranking of k normal populations 
where the true ranking, concerning which information is sought, is based on the 
population means; in a later paper the writer intends to treat the similar situa- 
tion where the true ranking, concerning which information is sought, is based 
on the population variances. 


2. The test of homogeneity (analysis of variance) approach. The classical test 
procedure known as the analysis of variance was introduced by Fisher [8], [9] 
as a method of analyzing certain types of complex experiments and since has 
become one of the basic tools of the practicing statistician. At the time of its
introduction the procedure represented a considerable contribution to the then
available body of statistical techniques. Perhaps its greatest accomplishment 
lay in the fact that it stressed to the experimenter the principle of orthogonality— 
a principle which if carefully adhered to would permit him to extract from com- 
plex experiments considerable information concerning the effects of each of the 
factors that entered into the experiment; this same principle of orthogonality 
made it possible for him to test, without difficulty, hypotheses concerning the 
existence of these effects. 

It has been recognized by many statisticians that the analysis of variance 
has certain deficiencies. However, these deficiencies do not lie in the design 
aspects of the procedure, but rather in the types of decisions which are made 
on the basis of the data. The substantial contribution of experimental design 
(in the Fisherian sense) to the planning of a meaningful experiment cannot be 
overemphasized. However, there seems to be considerable doubt as to the 
utility of the tests of hypotheses which usually are the end products of any 
analysis of variance. Cochran and Cox in their excellent book Experimental 
Designs point out that “On the whole . . . tests of significance are less frequently 
useful in experimental work than confidence limits. In many experiments it 
seems obvious that the different treatments must have produced some differ- 
ence, however small in effect. Thus the hypothesis that there is no difference is 
unrealistic: The real problem is to obtain estimates of the sizes of the differences.” 







However, in many instances the purpose of the experiment is not to estimate 
the sizes of differences, but rather to find the “best” treatment or treatments. 
The method of estimating the sizes of differences often is used as a way of at- 
tempting, indirectly, to achieve this goal. The method described in the next 
sections is a direct approach to a solution of the ranking problem. 

In these sections we shall assume the same underlying mathematical model 
as is assumed for the analysis of variance (Model I). However, we shall reformu- 
late the purpose of our observation-taking. Instead of being interested in testing 
hypotheses that population means are equal, we shall be interested in making 
certain inferences concerning the ranking of these population means. It is im- 
portant to emphasize, however, that in this ranking approach, experimental 
designs such as randomized blocks, Latin squares, etc., will play the same role 
as they do in the analysis of variance. 


3. The ranking (multiple decision) approach: the one-way classification. 

A. Statement of the problem. Let $X_{ij}$ be normally and independently distributed
chance variables $N(X_{ij} \mid \mu_i, \sigma_i^2)$, $(i = 1, 2, \ldots, k;\ j = 1, 2, \ldots, N_i)$. We assume
that the $\mu_i$ are unknown; the $\sigma_i^2$ are known and may be equal or unequal. Let
$\mu_{[1]} \le \mu_{[2]} \le \cdots \le \mu_{[k]}$ be the ranked $\mu_i$; we assume that it is not known which
population is associated with $\mu_{[i]}$ $(i = 1, 2, \ldots, k)$.

We further assume that a population is characterized by its population mean,
the “best” population being the one having the largest mean, the “second
best” being the one having the second largest mean, etc. (Alternatively, we
might have defined the “best” population as being the one having the smallest
mean, etc.; however, the mathematical theory is the same for both cases.) Thus,
the $k$ populations might be $k$ different varieties of grain, and $\mu_i$ might be the
(population) mean yield per acre of the $i$th variety. We would like on the basis
of a sample of $N = \sum_{i=1}^{k} N_i$ independent observations to make some inference
about the “bestness” of the populations. (This statement will be made precise
later.)

Our inferences will be based on the sample means. The sample mean from the
$i$th population will be denoted by $\bar X_i$. (For the sake of simplicity, no attempt
will be made in this paper to distinguish notationally between chance variables
and their observed values.) The sample mean, population variance, and sample
size associated with the population having population mean $\mu_{[i]}$ will be denoted
by $\bar X_{(i)}$, $\sigma_{(i)}^2$, and $N_{(i)}$, respectively, $(i = 1, 2, \ldots, k)$; that is, the expected
value of $\bar X_{(i)}$ is $\mu_{[i]}$ and the variance of $\bar X_{(i)}$ is $\sigma_{(i)}^2 / N_{(i)}$. The ranked $\bar X_i$ will be
denoted by

(1)    $\bar X_{[1]} < \bar X_{[2]} < \cdots < \bar X_{[k]}$.

The event $\bar X_i = \bar X_j$ $(i \ne j)$ is an event of probability zero and can be ignored
in probability calculations. However, in experimental situations this event can
occur frequently because of the limitations of the measuring instrument. If it
does occur, the tied means should be “ranked” using a randomized procedure
which assigns equal probability to each ordering.







In order to apply our procedures, the experimenter must decide what his goal
is before he takes his sample. For example, his goal may be to find any one of
the following (or others unlisted):

(2) The “best” population.
(3) The “best two” populations without regard to order.
(4) The “best two” populations with regard to order.
(5) The “best three” populations without regard to order.


The choice of a goal will depend on economic and other considerations outside 
the control of the statistician. 

Having chosen a goal, the statistical procedure is elementary. We take $N_i$
observations from the $i$th population $(i = 1, 2, \ldots, k)$. We compute the $k$
sample means $\bar X_1, \bar X_2, \ldots, \bar X_k$. We make the ranking (1). We then take action
as follows. If our goal is to find (2), we make the statement, “The population
associated with $\bar X_{[k]}$ is the ‘best’ population.” If our goal is to find (4), we make
the statement, “The populations associated with $\bar X_{[k]}$ and $\bar X_{[k-1]}$ are the ‘best’
and ‘second best’ populations, respectively.” If our goal is to find (3), (5), etc.,
we make similar statements.
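
The procedure just described can be sketched in a few lines of code. The following is only an illustrative sketch, not part of the paper: the function names `rank_populations` and `select_best` are hypothetical, each population's observations are assumed to be supplied as a list of numbers, and ties between sample means are broken by attaching an independent random key so that each ordering of tied means is equally likely.

```python
import random
from statistics import fmean


def rank_populations(samples, rng=None):
    """Return (order, means): population indices sorted from the smallest
    to the largest sample mean, together with the sample means themselves.

    Tied means are ordered by an independent random key, so every ordering
    of the tied populations is equally likely (hypothetical helper).
    """
    rng = rng or random.Random()
    means = [fmean(obs) for obs in samples]
    order = sorted(range(len(samples)), key=lambda i: (means[i], rng.random()))
    return order, means


def select_best(samples, t, rng=None):
    """Indices of the t populations with the largest sample means,
    reported without regard to order (goals (2), (3), (5) for t = 1, 2, 3)."""
    order, _ = rank_populations(samples, rng)
    return sorted(order[-t:])


if __name__ == "__main__":
    # Three illustrative populations with unequal sample sizes.
    data = [[1.0, 2.0, 3.0], [10.0, 11.0], [5.0, 5.0, 5.0, 5.0]]
    order, means = rank_populations(data)
    print("ranking (smallest to largest mean):", order)
    print("best two, unordered:", select_best(data, 2))
```

For a goal that does regard order, such as (4), the last two entries of `order` (read from the end) give the "best" and "second best" populations.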

For fixed values of the $\mu_i$ and $\sigma_i$ $(i = 1, 2, \ldots, k)$ the proportion of correct
statements that we make will depend only on the $N_i$, but the proportion will
differ, of course, for each kind of statement. We propose to design the experiment
in such a way (that is, choose the $N_i$ in such a way) that under specified
conditions the proportion of correct statements associated with our decision
procedure will be equal to or greater than some preassigned value.

B. Expressions for the probabilities. The required probabilities of a correct 
ranking can be expressed in two basically different forms, as volumes under 
multivariate normal surfaces, or as iterated integrals. We shall consider both 
forms. In order to do so we first must state our goal. 

A general goal for the one-way classification of means can be expressed as 
follows. (See also (24)). To find 


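        The $k_s$ “best” populations, the $k_{s-1}$ “second best” populations,
(6)     the $k_{s-2}$ “third best” populations, etc., and finally the $k_1$ “worst”
        populations.

Here $k_1, k_2, \ldots, k_s$ $(s \le k)$ are positive integers such that $\sum_{i=1}^{s} k_i = k$.
The probability of a correct ranking associated with (6) can be written as:

(7)
$$
\begin{aligned}
\Pr\bigl[\,&\max\{\bar X_{(1)}, \ldots, \bar X_{(k_1)}\} < \min\{\bar X_{(k_1+1)}, \ldots, \bar X_{(k_1+k_2)}\}, \\
&\max\{\bar X_{(k_1+1)}, \ldots, \bar X_{(k_1+k_2)}\} < \min\{\bar X_{(k_1+k_2+1)}, \ldots, \bar X_{(k_1+k_2+k_3)}\}, \\
&\qquad\vdots \\
&\max\{\bar X_{(k-k_s-k_{s-1}+1)}, \ldots, \bar X_{(k-k_s)}\} < \min\{\bar X_{(k-k_s+1)}, \ldots, \bar X_{(k)}\}\,\bigr].
\end{aligned}
$$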







If we assign particular values to $s$ and the $k_i$ we obtain several special cases
of interest, two of which we shall consider in some detail to illustrate the method.
For example, for $s = 2$; $k_1 = k - t$; $k_2 = t$, we have

(8)    $\Pr\bigl[\max\{\bar X_{(1)}, \bar X_{(2)}, \ldots, \bar X_{(k-t)}\} < \min\{\bar X_{(k-t+1)}, \ldots, \bar X_{(k)}\}\bigr]$

and for $s = k$; $k_1 = k_2 = \cdots = k_k = 1$, we have

(9)    $\Pr\bigl[\bar X_{(1)} < \bar X_{(2)} < \cdots < \bar X_{(k-1)} < \bar X_{(k)}\bigr]$.

Stated in words, (8) is the probability that the “best $t$” populations will yield
the largest sample means; thus (8) for $t = 1, 2, 3$ is the probability of a correct
ranking associated with (2), (3), (5), respectively.

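If we consider (8) for $t = 1$ we have

(10)
$$
\Pr\bigl[\max\{\bar X_{(1)}, \bar X_{(2)}, \ldots, \bar X_{(k-1)}\} < \bar X_{(k)}\bigr]
= \Pr[0 < Y_1,\ 0 < Y_2,\ \ldots,\ 0 < Y_{k-1}]
$$

where $Y_i = \bar X_{(k)} - \bar X_{(i)}$ $(i = 1, 2, \ldots, k-1)$. Then $E(Y_i) = \mu_{[k]} - \mu_{[i]} = \delta_{k,i}$
(say),

$$
\sigma^2(Y_i) = \frac{\sigma_{(k)}^2}{N_{(k)}} + \frac{\sigma_{(i)}^2}{N_{(i)}} \quad \text{for } (i = 1, 2, \ldots, k-1),
$$

$$
\sigma(Y_i, Y_j) = \frac{\sigma_{(k)}^2}{N_{(k)}} \quad \text{for } i \ne j\ (i, j = 1, 2, \ldots, k-1),
$$

and the $Y_i$ have a $(k - 1)$-variate normal distribution.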

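If we denote the covariance matrix of the $Y_i$ by $\Sigma$, and denote the row vectors
$(Y_1, Y_2, \ldots, Y_{k-1})$ and $(\delta_{k,1}, \delta_{k,2}, \ldots, \delta_{k,k-1})$ by $y'$ and $\delta'$, respectively, then
(10) is given by

(11)
$$
(2\pi)^{-(k-1)/2}\, |\Sigma|^{-1/2} \int_0^{\infty} \cdots \int_0^{\infty}
e^{-\frac12 (y-\delta)' \Sigma^{-1} (y-\delta)}\, dy_1\, dy_2 \cdots dy_{k-1}.
$$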


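If all of the means have the same variance, that is, if

(12)    $\sigma_{(i)}^2 / N_{(i)} = \sigma_{\bar X}^2$ (say),

then (10) is equal to

(13)
$$
k^{-1/2}\, \pi^{-(k-1)/2}
\int_{-\delta_{k,k-1}/(\sqrt{2}\,\sigma_{\bar X})}^{+\infty}
\int_{-\delta_{k,k-2}/(\sqrt{2}\,\sigma_{\bar X})}^{+\infty} \cdots
\int_{-\delta_{k,1}/(\sqrt{2}\,\sigma_{\bar X})}^{+\infty}
e^{-\frac12 x' P_1^{-1} x}\, dx_1\, dx_2 \cdots dx_{k-1}
$$

where $P_1 = \{\rho_{ij}\}$ is the $k - 1$ by $k - 1$ correlation matrix with

(14)
$$
\rho_{ij} =
\begin{cases}
1 & \text{for } i = j \\
\tfrac12 & \text{for } i \ne j
\end{cases}
\qquad (i, j = 1, 2, \ldots, k-1)
$$

and $x'$ denotes the row vector $(x_1, x_2, \ldots, x_{k-1})$. Similarly, under condition
(12), (9) is equal to

(15)
$$
k^{-1/2}\, \pi^{-(k-1)/2}
\int_{-\delta_{k,k-1}/(\sqrt{2}\,\sigma_{\bar X})}^{+\infty}
\int_{-\delta_{k-1,k-2}/(\sqrt{2}\,\sigma_{\bar X})}^{+\infty} \cdots
\int_{-\delta_{2,1}/(\sqrt{2}\,\sigma_{\bar X})}^{+\infty}
e^{-\frac12 z' P_2^{-1} z}\, dz_1\, dz_2 \cdots dz_{k-1}
$$

where $P_2 = \{\rho_{ij}\}$ is the $k - 1$ by $k - 1$ correlation matrix with

(16)
$$
\rho_{ij} =
\begin{cases}
1 & \text{for } i = j \\
-\tfrac12 & \text{for } |i - j| = 1 \\
0 & \text{for } |i - j| > 1
\end{cases}
\qquad (i, j = 1, 2, \ldots, k-1)
$$

and $z'$ denotes the row vector $(z_1, z_2, \ldots, z_{k-1})$.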

The probability (7) always is expressible as a sum of integrals of the form
(11), and if (12) is true, each of these is reducible to integrals of the form (13)
and (15). These integrals cannot be evaluated in finite terms, and the precise
determination of probabilities would in general require special tables. However,
for $k = 2$ the probabilities (13) and (15) are identical, simply being areas under
univariate normal curves; Eisenhart [7] has tabulated unity minus these
probabilities as a function of $\delta_{2,1}/\sigma$ and $N$ for the special case $\sigma_1^2 = \sigma_2^2 = \sigma^2$ and
$N_1 = N_2 = N$. For $k = 3$ the probabilities (13) and (15) are volumes under bivariate
normal surfaces with correlation coefficients $\rho = +\tfrac12$ and $\rho = -\tfrac12$, respectively,
and can be determined using [3] or [15]. For $4 \le k \le 10$ the probabilities (13)
can be determined using [11]; for $k \ge 4$ the probability (15) would require
special tables which have not yet been prepared. (For related tables see [11]
and Section 3D, and the tables at the end of this paper.)
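
As a rough numerical illustration of these probabilities (a sketch only, not a substitute for the tables cited above), the $k = 2$ case can be written in closed form as a univariate normal area, and the probability (10) for larger $k$ can be approximated by simulation. The function names below are hypothetical, the common standard deviation of the sample means is passed in as `sigma_xbar` (the situation of condition (12)), and the simulation uses a fixed seed purely for reproducibility.

```python
import random
from statistics import NormalDist


def prob_correct_k2(delta, sigma_xbar):
    """k = 2: Pr[X_(1) < X_(2)] when the difference of the two sample means is
    normal with mean delta and standard deviation sqrt(2) * sigma_xbar."""
    return NormalDist().cdf(delta / (2 ** 0.5 * sigma_xbar))


def simulate_best(mu, sigma_xbar, trials=20000, seed=0):
    """Monte Carlo estimate of the probability that the population with the
    largest mean in `mu` also yields the largest sample mean (probability (10)).
    Each sample mean is drawn directly as N(mu_i, sigma_xbar^2)."""
    rng = random.Random(seed)
    best = max(range(len(mu)), key=lambda i: mu[i])
    hits = 0
    for _ in range(trials):
        xbar = [rng.gauss(m, sigma_xbar) for m in mu]
        if max(range(len(mu)), key=lambda i: xbar[i]) == best:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print("k = 2, closed form :", prob_correct_k2(1.0, 1.0))
    print("k = 2, simulation  :", simulate_best([0.0, 1.0], 1.0))
    print("k = 4, simulation  :", simulate_best([0.0, 0.0, 0.0, 1.0], 1.0))
```

The simulation is adequate for a sanity check of a design but carries sampling error of order $1/\sqrt{\text{trials}}$.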

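In the first part of this section the probabilities (8) and (9) were expressed as
volumes under multivariate normal surfaces. These probabilities also can be
expressed as iterated integrals and for certain purposes the latter form is more
convenient. We shall illustrate the method for the probability (8) but shall do
so only for the special case

(17)    $\sigma_i = \sigma$, $N_i = N$    $(i = 1, 2, \ldots, k)$

which guarantees (12), and

(18)
$$
\begin{aligned}
\mu_{[k]} - \mu_{[k-t+1]} &= 0 \\
\mu_{[k-t+1]} - \mu_{[k-t]} &= \delta \ \text{(say)} \\
\mu_{[k-t]} - \mu_{[1]} &= 0.
\end{aligned}
$$

(In Section 3C we shall refer to condition (18) as “the least favorable
configuration of the population means.”)
When (17) and (18) hold we can write (8) as

(19)
$$
\begin{aligned}
&t \Pr\bigl[\max\{\bar X_{(1)}, \ldots, \bar X_{(k-t)}\} < \bar X_{(k-t+1)} < \min\{\bar X_{(k-t+2)}, \ldots, \bar X_{(k)}\}\bigr] \\
&\quad = t \int_{-\infty}^{+\infty}
\left[\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{(\bar X_{(k-t+1)} - \mu_{[k-t]})/(\sigma/\sqrt{N})} e^{-z^2/2}\, dz\right]^{k-t}
\left[\frac{1}{\sqrt{2\pi}} \int_{(\bar X_{(k-t+1)} - \mu_{[k-t+1]})/(\sigma/\sqrt{N})}^{+\infty} e^{-z^2/2}\, dz\right]^{t-1} \\
&\qquad\quad \times \frac{\sqrt{N}}{\sigma\sqrt{2\pi}}\,
e^{-N(\bar X_{(k-t+1)} - \mu_{[k-t+1]})^2 / 2\sigma^2}\, d\bar X_{(k-t+1)}
\end{aligned}
$$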


which, after making the transformation $y = \sqrt{N}(\bar X_{(k-t+1)} - \mu_{[k-t+1]})/\sigma$, yields³

(20)    $t \int_{-\infty}^{+\infty} [F(y + d)]^{k-t}\, [1 - F(y)]^{t-1} f(y)\, dy$

where

(21)    $F(y) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{y} e^{-x^2/2}\, dx, \qquad F'(y) = f(y)$

and

(22)    $d = \frac{\sqrt{N}\,\delta}{\sigma} = \sqrt{N}\,\lambda$ (say)

where $\lambda$ is the standardized difference between the population means. If $d = 0$
the probability (20) is equal to

(23)    $\frac{t!\,(k-t)!}{k!} = 1 \Big/ \binom{k}{t}$.

The integral (20) is easier to evaluate numerically than the integral (13) even
though the former is a more general case of (8) than the latter.
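
To illustrate how readily the single integral (20) can be evaluated, here is a minimal sketch that applies composite Simpson quadrature to the integrand $t\,[F(y+d)]^{k-t}[1-F(y)]^{t-1} f(y)$ with $F$ and $f$ the standard normal distribution function and density. The function name, the truncation of the range to $[-8, 8]$, and the number of subintervals are assumptions made for this example; they are not taken from the paper, whose tables were computed by other means.

```python
from math import comb
from statistics import NormalDist

_STD = NormalDist()  # standard normal: F = _STD.cdf, f = _STD.pdf


def prob_correct_selection(k, t, d, lo=-8.0, hi=8.0, steps=4000):
    """Approximate t * integral of F(y+d)^(k-t) * (1-F(y))^(t-1) * f(y) dy
    by composite Simpson's rule on [lo, hi]; `steps` must be even."""
    def g(y):
        return (_STD.cdf(y + d) ** (k - t)
                * (1.0 - _STD.cdf(y)) ** (t - 1)
                * _STD.pdf(y))

    h = (hi - lo) / steps
    total = g(lo) + g(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return t * total * h / 3.0


if __name__ == "__main__":
    # With d = 0 every subset of size t is equally likely to give the t
    # largest sample means, so the value should agree with 1 / C(k, t).
    print(prob_correct_selection(5, 2, 0.0), 1 / comb(5, 2))
    print(prob_correct_selection(4, 1, 2.0))
```

Two checks are available without tables: $d = 0$ must reproduce $t!\,(k-t)!/k!$, and $k = 2$, $t = 1$ must reproduce the univariate normal area $F(d/\sqrt{2})$.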


A more general goal than (6) also can be formulated. As an example of a
special case of this we shall find

(24)    $t_0$ of the $t$ “best” populations $(1 \le t_0 \le t)$.

For (24) the probability of a correct ranking is given by

(25)    $\frac{t!}{(t - t_0)!\,(t_0 - 1)!} \int_{-\infty}^{+\infty} [F(y + d)]^{k-t}\, [F(y)]^{t-t_0}\, [1 - F(y)]^{t_0-1} f(y)\, dy$

under the assumptions (17) and (18). We note that (25) reduces to (20) for
$t_0 = t$. In certain situations the experimenter may be willing to relax his
requirements and specify (24) as his goal⁴ rather than the corresponding case of (6).
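
The relaxed goal (24) can be explored numerically in the same spirit. The sketch below assumes the integrand $[F(y+d)]^{k-t}[F(y)]^{t-t_0}[1-F(y)]^{t_0-1}f(y)$ with the leading factor $t!/((t-t_0)!\,(t_0-1)!)$, that is, the density of the $t_0$-th largest of $t$ standard normal variables multiplied by the chance that the remaining $k - t$ sample means fall below it. The function name and the quadrature settings are hypothetical choices for this illustration.

```python
from math import factorial
from statistics import NormalDist


def prob_t0_of_t_best(k, t, t0, d, lo=-8.0, hi=8.0, steps=4000):
    """Composite Simpson approximation to the probability that the t0 largest
    sample means all come from the t best populations, under the
    configuration where the t best means exceed the rest by d (standardized)."""
    std = NormalDist()
    coef = factorial(t) / (factorial(t - t0) * factorial(t0 - 1))

    def g(y):
        big_f = std.cdf(y)
        return (std.cdf(y + d) ** (k - t)
                * big_f ** (t - t0)
                * (1.0 - big_f) ** (t0 - 1)
                * std.pdf(y))

    h = (hi - lo) / steps  # steps must be even for Simpson's rule
    total = g(lo) + g(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return coef * total * h / 3.0


if __name__ == "__main__":
    # Asking only for one of the three best is easier than asking for all three.
    print(prob_t0_of_t_best(6, 3, 1, 1.5))
    print(prob_t0_of_t_best(6, 3, 3, 1.5))
```

Comparing the two printed values for the same $d$ shows how much the required probability is eased by the weaker goal.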

C. Determination of the sample sizes. The “distances” of the $k$ populations
from each other can be expressed in terms of the $k - 1$ parameters

(26)    $\delta_{i+1,i} = \mu_{[i+1]} - \mu_{[i]}$    $(i = 1, 2, \ldots, k-1)$.

To simplify notation let

(27)    $\sum_{j=1}^{i} k_j = \hat k_i$ (say).

Then we note that of the $k - 1$ parameters (26), $s - 1$ of them, namely,

(28)    $\delta_{\hat k_i + 1,\, \hat k_i}$    $(i = 1, 2, \ldots, s-1)$

exercise a general over-all control on the probability (7), since if the parameters
(28) are “small” the probability (7) is relatively low and if they are “large”
the probability (7) is high. (For example, for (8) the parameter $\delta_{k-t+1,\,k-t}$ is the
controlling one. If this parameter is made arbitrarily small, the probability (8)
lies between (23) and one-half; by increasing this parameter, the probability (8)
can be made arbitrarily close to unity.) It is obvious that for fixed nonzero
values of the parameters (28), and for fixed population variances $\sigma_i^2$, the
probability (7) can be made arbitrarily close to unity by making the $N_i$ $(i = 1, 2,
\ldots, k)$ sufficiently large. But if one or more of the $s - 1$ parameters (28) is
very small, the $N = \sum_{i=1}^{k} N_i$ required to realize any probability close to unity
will be extremely large.

3 Paulson obtained this integral for the special case $t = 1$; see equations (2.2) in [12] and
(2) in [13].

4 The formulation of the problem as given in this paragraph was suggested to the writer
by Dr. Milton Sobel who also derived the expression (25).

Now in most experimental situations there seems to be little if any reason
for attempting to differentiate between any pair of populations characterized
by $\mu_{[\hat k_i + 1]}$ and $\mu_{[\hat k_i]}$ if the corresponding parameter (28) is very small since the
expense involved in guaranteeing a high probability of a correct ranking may
be prohibitive and/or the economic loss involved in making an incorrect ranking
may be negligible. In fact, in most situations it should be possible to specify
$s - 1$ constants

(29)    $\delta^*_{\hat k_i + 1,\, \hat k_i}$    $(i = 1, 2, \ldots, s-1)$

which are the smallest values of the parameters (28) which are “worth
detecting.” We shall assume that these are given in what follows.

It is our purpose to find the smallest $N = \sum_{i=1}^{k} N_i$ which will guarantee a
specified probability $\gamma < 1$ of a correct ranking whenever $\delta_{\hat k_i + 1,\, \hat k_i} \ge \delta^*_{\hat k_i + 1,\, \hat k_i}$
$(i = 1, 2, \ldots, s-1)$. As a device for doing this we consider the least favorable
configuration of the population means. This configuration is defined as being the
one which, for fixed $N_i$ and $\sigma_i$ $(i = 1, 2, \ldots, k)$, yields the greatest lower bound
of the probability of a correct ranking. Since the probability (7) is a strictly
increasing function of each of the parameters $\delta_{i+1,i}$ $(i = 1, 2, \ldots, k-1)$, it is
easy to see that the greatest lower bound is achieved when each of the $s - 1$
parameters given by (28) has the corresponding parameter values given by
(29), and each of the $k - s$ parameters given by (26) but not given by (28)
has the value zero. The desired $N = \sum_{i=1}^{k} N_i$ is then the smallest one which will
guarantee the probability $\gamma$ for the least favorable configuration. Of course,
the efficient choice of the $N_i$ will depend on the $\sigma_i$ $(i = 1, 2, \ldots, k)$. For fixed
$N_i$ and $\sigma_i$ $(i = 1, 2, \ldots, k)$ the probability (7) considered as a function of the
$\delta_{i+1,i}$ $(i = 1, 2, \ldots, k-1)$ gives an analogue of power and might be termed the
operating characteristic curve with respect to a correct ranking for the procedure.

(i) Variances known and equal. If $\sigma_i^2 = \sigma^2$ $(i = 1, 2, \ldots, k)$ where $\sigma^2$ is a
known constant then it would appear to be most efficient to choose equal sample
sizes from each population. (In this and the next two subsections, the most
efficient allocation of the sample sizes will be defined as the one which, for
fixed total sample size, maximizes the minimum probability of a correct ranking.
The writer does not claim at this time that all of the procedures described in
this paper are most efficient.) We choose the common sample size $N'$ in such
a way that $N = \sum_{i=1}^{k} N_i = kN'$ is the smallest integer which will guarantee
$\gamma$ for the least favorable configuration. The probability (7) then will prove to
be a function of only the $k_i$ $(i = 1, 2, \ldots, s)$ and the $s - 1$ constants

(30)    $\frac{\sqrt{N'}\, \delta^*_{\hat k_i + 1,\, \hat k_i}}{\sigma}$    $(i = 1, 2, \ldots, s-1)$.

For example, for (8) we have $s = 2$ and

(31)    $\frac{\sqrt{N'}\, \delta^*_{\hat k_1 + 1,\, \hat k_1}}{\sigma} = \frac{\sqrt{N'}\, \delta^*_{k-t+1,\, k-t}}{\sigma}$.
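
The search for the smallest common sample size can be sketched directly for the goal (8). The code below is a hedged illustration rather than the tabular method of the paper: it assumes equal known variances, uses the least favorable configuration in which the $t$ best means exceed the others by exactly $\delta^*$, evaluates the probability of a correct selection by Simpson quadrature of the single integral with $d = \sqrt{N'}\,\delta^*/\sigma$, and increases $N'$ one unit at a time until the probability reaches $\gamma$. All function and parameter names are hypothetical.

```python
from statistics import NormalDist

_STD = NormalDist()


def pcs(k, t, d, lo=-8.0, hi=8.0, steps=2000):
    """Probability of a correct selection of the t best of k populations under
    the least favorable configuration, as a function of d (Simpson's rule)."""
    def g(y):
        return (_STD.cdf(y + d) ** (k - t)
                * (1.0 - _STD.cdf(y)) ** (t - 1)
                * _STD.pdf(y))

    h = (hi - lo) / steps
    total = g(lo) + g(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return t * total * h / 3.0


def smallest_common_sample_size(k, t, delta_star, sigma, gamma, n_max=100000):
    """Smallest N' with pcs(k, t, sqrt(N') * delta_star / sigma) >= gamma.
    The total sample size is then N = k * N'."""
    for n_prime in range(1, n_max + 1):
        d = (n_prime ** 0.5) * delta_star / sigma
        if pcs(k, t, d) >= gamma:
            return n_prime
    raise ValueError("gamma not reached within n_max observations per population")


if __name__ == "__main__":
    n_prime = smallest_common_sample_size(k=4, t=1, delta_star=0.5, sigma=1.0, gamma=0.90)
    print("observations per population:", n_prime, " total:", 4 * n_prime)
```

Because the probability is increasing in $d$, the first $N'$ that reaches $\gamma$ is the smallest one; a bisection on $d$ followed by rounding up would serve equally well.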


(ii) Variances known and unequal. If $\sigma_i^2 = a_i \sigma^2$ $(i = 1, 2, \ldots, k)$, where $\sigma^2$ is a
known constant, and the $a_i$ are known constants not all of which are equal to
unity, then it may be desirable to choose the sample sizes so that the variances
of the sample means are equal. This choice is not most efficient. (If we restrict
our attention to procedures for which the sample sizes are taken to make the
population variances of the sample means equal, then it can be shown that the
minimax procedure for (2) is: “select as the ‘best’ population the one having
the largest sample mean.”) However, it has a very important practical advantage;
namely, that the tables which give the probability of a correct ranking for the
special case $\sigma_1^2 = \sigma_2^2 = \cdots = \sigma_k^2$ and $N_1 = N_2 = \cdots = N_k$ then become
applicable. In order to apply these tables when the $N_i$ $(i = 1, 2, \ldots, k)$ are subject
to the restriction

(32)    $\frac{\sigma_1^2}{N_1} = \frac{\sigma_2^2}{N_2} = \cdots = \frac{\sigma_k^2}{N_k}$

we proceed as follows. We act as if the $k$ populations had the common variance
$\sigma^2$ which is the known constant referred to above. Using the method of the
previous section, we find $N = kN'$ where $N'$ is the number of observations taken
from each population. We then set

(33)    $\frac{\sigma_i^2}{N_i} = \frac{a_i \sigma^2}{N_i} = \frac{\sigma^2}{N'}$    $(i = 1, 2, \ldots, k)$

from which it follows that we choose the individual $N_i$ so that

(34)    $N_i = a_i N'$.

If any $N_i$ so chosen is not an integer, we replace it by the next larger integer.
Because of (33), it is clear that these $N_i$ guarantee $\gamma$.

As was indicated above, for fixed total sample size (34) does not define the
most efficient choice of the $N_i$ for arbitrary $a_i$. For example, for $k = 2$ it can
be shown that the most efficient method of choosing $N_1$ and $N_2$ is to select that
pair $(N_1, N_2)$ which satisfies the equations $N_2 \sigma_1 = N_1 \sigma_2$ and $N_1 + N_2 = N$
where $\sigma_1$ and $\sigma_2$ are known and $N$ is specified. For $k > 2$, the rule by which a
most efficient choice of the $N_i$ $(i = 1, 2, \ldots, k)$ is made appears to be too
complicated for practical application; also, the number of tables needed would be
prohibitively large.
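
Both allocation rules mentioned for unequal variances are simple enough to state in code. The sketch below is illustrative only: `allocate_equal_mean_variance` applies $N_i = a_i N'$ and rounds any non-integer result up, while `allocate_two_populations` solves $N_2\sigma_1 = N_1\sigma_2$, $N_1 + N_2 = N$ for $k = 2$ and returns real-valued sizes that the experimenter would still have to round. The function names are hypothetical.

```python
from math import ceil


def allocate_equal_mean_variance(a, n_prime):
    """Sample sizes N_i = a_i * N' (rounded up to an integer), which make
    sigma_i^2 / N_i no larger than sigma^2 / N' when sigma_i^2 = a_i * sigma^2.

    Note: a_i * n_prime is computed in floating point, so a product that is an
    integer in exact arithmetic may be rounded up by one if a_i is not exactly
    representable."""
    return [ceil(a_i * n_prime) for a_i in a]


def allocate_two_populations(sigma1, sigma2, n_total):
    """k = 2: real-valued (N1, N2) with N2 * sigma1 = N1 * sigma2 and
    N1 + N2 = n_total, i.e. sizes proportional to the standard deviations."""
    n1 = n_total * sigma1 / (sigma1 + sigma2)
    return n1, n_total - n1


if __name__ == "__main__":
    print(allocate_equal_mean_variance([1.0, 2.0, 0.5], 11))
    print(allocate_two_populations(1.0, 3.0, 40))
```

For $k = 2$ the first rule gives sizes proportional to the variances, whereas the second gives sizes proportional to the standard deviations, which is why the first is convenient but not most efficient.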

(iii) Variances unknown. If the values of the $\sigma_i^2$ $(i = 1, 2, \ldots, k)$ are completely
unknown (or even if it is known that $\sigma_1^2 = \sigma_2^2 = \cdots = \sigma_k^2$ where the common
value of the variances is unknown), it is not possible using a one-sample
procedure to make any useful statement concerning the magnitude of the confidence
coefficient. (Actually the experimenter faces the same type of dilemma here as
he does when he desires to make a statement about the power of an analysis of
variance test when the variance is unknown.) For this problem an analogue of
Stein's two-sample procedure [17] has been developed to provide a solution of
the ranking problem. This new procedure will be described in a forthcoming
paper. (See [1].)

D. Discussion of tables. Tables have been prepared to assist the experimenter 
in designing and interpreting experiments for ranking means. 

Table I is to be used for designing experiments involving $k$ normal populations
to decide which $t$ have the largest (or smallest) population means. The table is
based on probability (20), and gives the value of $d$ $(= \sqrt{N}\lambda)$ associated with the
probabilities 0.05 (0.05) 0.80 (0.02) 0.90 (0.01) 0.99, 0.995, 0.999, and 0.9995
for $k = 2(1)10$ and $t = 1(1)[k/2]$ (as well as $k = 11(1)15$ and selected values
of $t$), where $[k/2]$ $(k = 1, 2, \ldots, 15)$ is the largest integer less than or equal to $k/2$.
The table is based on the least favorable configuration of the population means
which, for picking the $t$ largest, is given by $\mu_{[k-t]} - \mu_{[1]} = 0$,
$\mu_{[k-t+1]} - \mu_{[k-t]} = \lambda\sigma$, $\mu_{[k]} - \mu_{[k-t+1]} = 0$, and for picking the $t$ smallest
is given by the same expressions with $t$ replaced by $k - t$. The values of $d$ were
obtained by inverse linear interpolation in [11].

Table II is a special table to be used for designing experiments involving 3
normal populations to decide which one has the largest, which the second largest,
and which the smallest population mean. The table is based on probability
(15) for $k = 3$, and gives the value of $d$ $(= \sqrt{N}\lambda)$ associated with the probabilities
1/6, 0.20 (0.05) 0.80 (0.02) 0.90 (0.01) 0.99. The least favorable configuration
$\mu_{[3]} - \mu_{[2]} = \mu_{[2]} - \mu_{[1]} = \lambda\sigma$ is assumed throughout. The values of $d$ were
obtained by inverse interpolation in [3] and [15]. For convenience of tabulation
the standardized differences between the population means were taken as equal.
A table for unequal differences could be prepared using [15].

Examples of the use of the tables are given in Section 8.


4. The ranking (multiple decision) approach: the two-way classification without
interaction.

A. Statement of the problem. Let $X_{ijm}$ be normally and independently distributed
chance variables $N(X_{ijm} \mid \mu + \alpha_i + \beta_j, \sigma_{ij}^2)$, $(i = 1, 2, \ldots, r;\ j = 1, 2, \ldots, c;\ m = 1, 2, \ldots, N_{ij})$,
with $\sum_{i=1}^{r} \alpha_i = \sum_{j=1}^{c} \beta_j = 0$. We assume that
$\mu$, the $\alpha_i$, and the $\beta_j$ are unknown; the $\sigma_{ij}^2$ are known and may be equal or unequal.
Let $\alpha_{[1]} \le \alpha_{[2]} \le \cdots \le \alpha_{[r]}$ and $\beta_{[1]} \le \beta_{[2]} \le \cdots \le \beta_{[c]}$ be the ranked
$\alpha_i$ and $\beta_j$, respectively; we assume that it is not known which populations are
associated with either $\alpha_{[i]}$ or $\beta_{[j]}$.

5 These tables were computed for this project by the National Bureau of Standards, at
the Institute for Numerical Analysis, Los Angeles; the computations were supported by
the Office of Naval Research.





26 ROBERT E, BECHHOFER 


As in Section 3A we assume that a population is characterized by its population
mean which for the two-way classification (no interaction) consists of two
components of interest each one of which measures a classification "effect."
The "best" set of populations with respect to the first classification is the one
consisting of those populations having population means $\mu + \alpha_{[r]} + \beta_j$
$(j = 1, 2, \ldots, c)$; the "best" set of populations with respect to the second
classification is the one consisting of those populations having population means
$\mu + \alpha_i + \beta_{[c]}$ $(i = 1, 2, \ldots, r)$; the "second best," etc. sets of populations with
respect to either the first or the second classifications are defined in the obvious
way. Thus the $rc$ populations might be the $rc$ combinations of $r$ different varieties
of grain and $c$ different types of fertilizer, and $\mu + \alpha_i + \beta_j$ might be the (population)
mean yield per acre of the $i$th variety treated with the $j$th fertilizer. (Here
we are assuming no variety-fertilizer interaction.) We would like on the basis
of a sample of $N = \sum_{i=1}^{r} \sum_{j=1}^{c} N_{ij}$ independent observations to make inferences
about the "bestness" of the populations for each of the two classifications.

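Our inferences will be based on the sample means which will be denoted by

(35) $\bar X_{ij} = \dfrac{1}{N_{ij}} \sum_{m=1}^{N_{ij}} X_{ijm}$ $(i = 1, 2, \ldots, r;\ j = 1, 2, \ldots, c)$,

(36) $\bar X_{i\cdot} = \dfrac{1}{c} \sum_{j=1}^{c} \bar X_{ij}$ $(i = 1, 2, \ldots, r)$,

and

(37) $\bar X_{\cdot j} = \dfrac{1}{r} \sum_{i=1}^{r} \bar X_{ij}$ $(j = 1, 2, \ldots, c)$.

The sample mean, population variance, and sample size associated with the
population having population mean $\mu + \alpha_{[i]} + \beta_{[j]}$ will be denoted by $\bar X_{[i][j]}$,
$\sigma^2_{[i][j]}$, and $N_{[i][j]}$, respectively, $(i = 1, 2, \ldots, r;\ j = 1, 2, \ldots, c)$; that is,
the expected value of $\bar X_{[i][j]}$ is $\mu + \alpha_{[i]} + \beta_{[j]}$ and the variance of $\bar X_{[i][j]}$
is $\sigma^2_{[i][j]}/N_{[i][j]}$.

We also define

(38) $\bar X_{[i]\cdot} = \dfrac{1}{c} \sum_{j=1}^{c} \bar X_{[i][j]}$ $(i = 1, 2, \ldots, r)$

and

(39) $\bar X_{\cdot[j]} = \dfrac{1}{r} \sum_{i=1}^{r} \bar X_{[i][j]}$ $(j = 1, 2, \ldots, c)$.

The ranked $\bar X_{i\cdot}$ and $\bar X_{\cdot j}$ will be denoted by

(40) $\bar X_{(1)\cdot} < \bar X_{(2)\cdot} < \cdots < \bar X_{(r)\cdot}$

and

(41) $\bar X_{\cdot(1)} < \bar X_{\cdot(2)} < \cdots < \bar X_{\cdot(c)}$,

respectively.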

Goals for the two-way classification are of the same type as for the one-way 
classification except that they consist of two parts. For example, the experi- 
menter’s goal may be to find any one of the following (or others unlisted): 


The “best” set of populations according to the first classifi- 
(42) cation and the “‘best”’ set of populations according to the second 
classification. 


The “best” set of populations according to the first classifica- 
(43) tion and the “best two” sets of populations without regard to 
order according to the second classification. 


Having chosen our goal, we take $N_{ij}$ observations from the $(i, j)$th population
and compute the $r + c$ sample means (36) and (37). We make the rankings
(40) and (41). If our goal is to find (42), we make the statement, "The set of
populations associated with $\bar X_{(r)\cdot}$ is the 'best' set of populations according to
the first classification and the set of populations associated with $\bar X_{\cdot(c)}$ is the
'best' set of populations according to the second classification." If our goal is
to find (43), etc., we would make similar statements. For fixed values of the
$\alpha_i$, $\beta_j$, and $\sigma_{ij}^2$ $(i = 1, 2, \ldots, r;\ j = 1, 2, \ldots, c)$ the proportion of correct
statements that we make will depend only on the $N_{ij}$.

B. Expression for the probabilities. A general goal for the two-way classifica- 
tion is similar to (6) except that it consists of two parts. We shall not write it 
explicitly nor shall we write the associated probability of a correct ranking. 
However, to illustrate the method of evaluating such a probability we shall 
consider the following special case in some detail: 


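(44) $\Pr[\bar X_{[1]\cdot} < \bar X_{[2]\cdot} < \cdots < \bar X_{[r]\cdot} \text{ and } \bar X_{\cdot[1]} < \bar X_{\cdot[2]} < \cdots < \bar X_{\cdot[c]}]$
     $= \Pr[0 < Y_1, 0 < Y_2, \ldots, 0 < Y_{r-1} \text{ and } 0 < Z_1, 0 < Z_2, \ldots, 0 < Z_{c-1}]$

where

(45) $Y_i = \bar X_{[i+1]\cdot} - \bar X_{[i]\cdot}$ $(i = 1, 2, \ldots, r - 1)$,
     $Z_j = \bar X_{\cdot[j+1]} - \bar X_{\cdot[j]}$ $(j = 1, 2, \ldots, c - 1)$.

We note that the $Y_i$ and $Z_j$ have a joint $(r + c - 2)$-variate normal distribution.
If the means in each cell have the same variance, that is, if

(46) $\sigma_{ij}^2/N_{ij} = \sigma_0^2$ (say) $(i = 1, 2, \ldots, r;\ j = 1, 2, \ldots, c)$,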





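then the joint covariance matrix of the $Y_i$ and $Z_j$ simplifies considerably and
we have

(47) $\sigma^2(Y_i) = \dfrac{2\sigma_0^2}{c}$ $(i = 1, 2, \ldots, r - 1)$,

     $\sigma^2(Z_j) = \dfrac{2\sigma_0^2}{r}$ $(j = 1, 2, \ldots, c - 1)$,

     $\sigma(Y_i Y_j) = -\dfrac{\sigma_0^2}{c}$ if $|i - j| = 1$, and $0$ if $|i - j| > 1$,

     $\sigma(Z_i Z_j) = -\dfrac{\sigma_0^2}{r}$ if $|i - j| = 1$, and $0$ if $|i - j| > 1$,

     $\sigma(Y_i Z_j) = 0$ $(i = 1, 2, \ldots, r - 1;\ j = 1, 2, \ldots, c - 1)$.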


The fact that $\sigma(Y_i Z_j) = 0$ (all $i$, $j$) implies that the $Y_i$'s are independent of
the $Z_j$'s (a sufficient condition for the $Y_i$'s to be independent of the $Z_j$'s is that
within every row or every column the variances of the means are equal), and
we have that (44) is equal to

(48) $\Pr[0 < Y_1, \ldots, 0 < Y_{r-1}] \, \Pr[0 < Z_1, \ldots, 0 < Z_{c-1}]$.


Thus, if (46) holds, the probability for the two-way classification reduces to 
the product of the probabilities for two one-way classifications. 

It is important to note that if it is desired to increase the first of the probabilities
in the product (48), this is accomplished (for fixed $c$) by decreasing
$\sigma_0^2$ defined by (46), that is, by increasing the $N_{ij}$. But increasing the $N_{ij}$ also
has the effect of increasing the second of the probabilities in the product (48).
Thus the factorial design of the experiment makes the data "work twice" and
is in this sense more efficient than two separate experiments.


5. The ranking (multiple decision) approach: the r-way classification without
interaction. For this problem the $X_{i_1 i_2 \cdots i_r m}$ are normally and independently
distributed chance variables

$N\bigl(X_{i_1 i_2 \cdots i_r m} \mid \mu + \sum_{j=1}^{r} \alpha_{i_j},\ \sigma^2_{i_1 i_2 \cdots i_r}\bigr)$ with $\sum_{i_j} \alpha_{i_j} = 0$
$(j = 1, 2, \ldots, r;\ m = 1, 2, \ldots, N_{i_1 i_2 \cdots i_r})$.

We would like on the basis of a sample of $\sum_{i_1} \cdots \sum_{i_r} N_{i_1 i_2 \cdots i_r}$ independent
observations to make inferences about the "bestness" of the populations for
each of the $r$ classifications. This problem is a straightforward generalization
of the case $r = 2$ treated in the previous section.


6. The ranking (multiple decision) approach: experimental designs. Designs
such as randomized blocks, Latin squares, etc. are used in experimentation to
eliminate the effects of heterogeneity in one or two directions. Their use results
in a reduction in the underlying variance of the experiment, and it therefore is
possible to make more precise comparisons among the "treatment effects."
These designs serve the same function in the ranking approach as they do in the
analysis of variance. We shall illustrate this point with randomized blocks, and
the carry-over to other more complex designs will be immediate.

We assume the same mathematical model for the randomized blocks design
as for the two-way classification without interaction. However, in this case we
are not concerned with the (block) effects $\beta_j$ since the blocks are introduced
only to reduce the $\sigma_{ij}^2$. We define our sample means as we did for the two-way
classification. Since the expected value of the differences between the treatment
(row) means involves only the $\alpha_i$, our problem is thrown into the form of the
one-way classification which we have considered already.


7. Large sample applications of the ranking theory. The results obtained in
the previous section can be used to rank parameters other than the population
means of normal distributions, provided that sufficiently large samples are available,
and the statistics that are used to estimate these parameters are normally
distributed in the limit. Since reasonably large sample sizes usually are required
to achieve the desired probabilities in ranking problems, and since the central
limit theorem applies under very general conditions, many ranking problems can
be solved using the already-developed normal theory.

In many problems the approach of the statistic to normality will be accelerated,
and the dependence of the mean on the variance will be minimized, if the statistic
is appropriately transformed. Thus, for example, the population probabilities
of "success" in binomial distributions, or the population means of Poisson
distributions can be ranked using the transformations arc sin $\sqrt{x}$ and $\sqrt{x}$,
respectively. Similarly, the population correlation coefficients of bivariate
normal distributions can be ranked if the transformation
$z = \tfrac{1}{2} \log_e [(1 + r)/(1 - r)]$ is used.
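
The three transformations named above can be written out directly. This is a minimal sketch using only the Python standard library; the function names and the sample inputs are illustrative assumptions, not part of the paper.

```python
import math


def arcsine_sqrt(p_hat):
    """Arc sine square-root transform of an observed binomial proportion."""
    return math.asin(math.sqrt(p_hat))


def sqrt_count(x):
    """Square-root transform of an observed Poisson count."""
    return math.sqrt(x)


def fisher_z(r):
    """z = (1/2) * log_e((1 + r) / (1 - r)) for a sample correlation r."""
    return 0.5 * math.log((1.0 + r) / (1.0 - r))


# Hypothetical inputs chosen so the results are easy to check by hand.
print(round(arcsine_sqrt(0.25), 4))  # 0.5236  (asin(0.5) = pi / 6)
print(sqrt_count(9.0))               # 3.0
print(round(fisher_z(0.6), 4))       # 0.6931  (0.5 * log_e(4))
```

After transformation the statistics are ranked in place of the sample means, with the normal-theory tables applied as before.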


8. Examples. Several numerical examples will be given here to illustrate the 
use of the tables. It will be assumed that the mathematical models of Section 
3 and Section 4 hold for Examples 1 and 2 and Example 3, respectively. No 
attempt will be made to relate these examples to any particular subject matter 
field. 

Example 1. Given a one-way classification of three populations. Suppose
that it is desired to find which population has the largest mean, and to guarantee
that the probability of correctly choosing that population will be at least 0.75
when $\mu_{[3]} - \mu_{[2]} \ge 4$. How many observations must be taken from each population?

Refer to Table I, column headed $k = 3$, $t = 1$.

a) Suppose that it is known that $\sigma_1^2 = \sigma_2^2 = \sigma_3^2 = \sigma^2 = 100$. Then we follow
the method of Section 3C(i). Entering the table we find that the value of $\sqrt{N}\lambda$
associated with a probability of 0.75 is 1.4338. We have $\lambda = 4/\sigma = 0.4$. Thus,
$0.4\sqrt{N} = 1.4338$, and hence select 13 observations from each population.
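
The arithmetic of part a) can be checked with a short sketch. The function name `sample_size_equal_variances` is an assumption made for illustration; the inputs (tabled value 1.4338, difference 4, $\sigma = 10$) are those of the example.

```python
import math


def sample_size_equal_variances(d, delta, sigma):
    """Smallest integer N satisfying sqrt(N) * (delta / sigma) >= d."""
    lam = delta / sigma  # standardized difference, lambda
    return math.ceil((d / lam) ** 2)


# d = 1.4338 is the tabled value of sqrt(N) * lambda for probability 0.75.
n = sample_size_equal_variances(d=1.4338, delta=4.0, sigma=10.0)
print(n)  # 13
```

The unrounded solution is about 12.85, so rounding up to 13 guarantees the stated probability.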





TABLE I

Table of $\sqrt{N}\lambda$ corresponding to various probabilities, to be used for designing
experiments involving k normal populations to decide which t
have the largest (or smallest) population means

[Tabulated values of $\sqrt{N}\lambda$ omitted: the entries are not legible in this copy.]



TABLE II

Table of $\sqrt{N}\lambda$ corresponding to various probabilities, to be used for designing
experiments involving 3 normal populations to decide which one has the largest,
which the second largest, and which the smallest population mean

Prob. of                     Prob. of                     Prob. of
Correct    $\sqrt{N}\lambda$     Correct    $\sqrt{N}\lambda$     Correct    $\sqrt{N}\lambda$
Ranking                      Ranking                      Ranking

0.99       3.6428            0.88       …                 0.50       0.9084
0.98       3.2900            0.86       2.086…            0.45       0.7836
0.97       3.0690            0.84       1.9855            0.40       0.6592
0.96       2.9044            0.82       1.8935            0.35       0.5328
0.95       2.7717            0.80       1.8094            0.30       0.4021

0.94       2.6598            0.75       1.6211            0.25       0.2635
0.93       2.5623            0.70       1.4560            0.20       0.1121
0.92       2.4756            0.65       1.3064
0.91       2.3974            0.60       1.1674            1/6        0.0000
0.90       2.3258            0.55       1.0356

b) Suppose that it is known that $\sigma_1^2 = 90$, $\sigma_2^2 = 130$, and $\sigma_3^2 = 191$. Following
the method of Section 3C(ii) we see that we can let $\sigma^2 = 100$; $a_1 = 0.90$, $a_2 = 1.30$,
and $a_3 = 1.91$. Then we have $\lambda = 4/\sigma$. From a), above, we see that
$N' = (1.4338/0.4)^2$. Using equation (34) we find that $N_1 = 11.6$, $N_2 = 16.7$,
$N_3 = 24.6$; thus we select 12, 17, and 25 observations from populations 1, 2,
and 3, respectively.
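
The allocation in part b) can be sketched as follows. Equation (34) is not reproduced in this section, so the code assumes it amounts to $N_i = a_i N'$, which agrees with the figures quoted above; the function name `allocate_unequal` is likewise an illustrative assumption.

```python
import math


def allocate_unequal(d, lam, a):
    """Scale the reference sample size N' = (d / lam)^2 by each variance ratio a_i.

    Assumes N_i = a_i * N' (our reading of equation (34)); returns the exact
    values and the values rounded up to whole observations.
    """
    n_ref = (d / lam) ** 2
    exact = [a_i * n_ref for a_i in a]
    return exact, [math.ceil(x) for x in exact]


exact, rounded = allocate_unequal(d=1.4338, lam=0.4, a=[0.90, 1.30, 1.91])
print(rounded)  # [12, 17, 25]
```

Rounding each $N_i$ up keeps every variance of a sample mean at or below the reference value $\sigma^2/N'$.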

Example 2. Given a one-way classification of three populations. Suppose
that we have selected 15 observations from each of the populations. What is
the smallest difference $\mu_{[3]} - \mu_{[2]} = \mu_{[2]} - \mu_{[1]}$ that we can guarantee detecting
with probability at least 0.80?

Refer to Table II.

Entering the table we find that the value of $\sqrt{N}\lambda$ associated with a probability
of 0.80 is 1.8094. If $\sigma_1 = \sigma_2 = \sigma_3 = \sigma$ is known, say equal to 6 units, we
have $\sqrt{15}\,(\mu_{[i+1]} - \mu_{[i]})/6 = 1.8094$ for $i = 1, 2$; hence $\mu_{[3]} - \mu_{[2]} = \mu_{[2]} - \mu_{[1]}$
= 2.80 units; if the variances are completely unknown, no useful statement
can be made.
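
The computation in Example 2 is a one-line inversion of $d = \sqrt{N}\,(\text{difference})/\sigma$. A minimal sketch, with the function name `detectable_difference` chosen here for illustration:

```python
import math


def detectable_difference(d, sigma, n):
    """Solve sqrt(n) * diff / sigma = d for diff."""
    return d * sigma / math.sqrt(n)


# d = 1.8094 is the Table II value for probability 0.80; sigma = 6, n = 15.
diff = detectable_difference(d=1.8094, sigma=6.0, n=15)
print(round(diff, 2))  # 2.8
```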

Example 3. Given a two-way classification (2 rows, 3 columns) of six populations,
each having the same variance $\sigma^2$. Suppose that it is desired to find which
set of populations has the largest row mean, and which set of populations has
the smallest column mean, and to guarantee that the probability of correctly
choosing these two sets will be at least 0.60 when $\alpha_{[2]} - \alpha_{[1]} \ge 0.2\sigma$ and
$\beta_{[2]} - \beta_{[1]} \ge 0.4\sigma$. How many observations must be taken from each population?

Refer to Table I, columns headed $k = 2$, $t = 1$ and $k = 3$, $t = 1$.

For a sample of size 9 from each population we have (9)(2) = 18 and (9)(3) =
27 observations contributing to the column and row means, respectively. For
$k = 2$, $t = 1$ we have $d_1 = \lambda_1\sqrt{N_1} = 0.2\sqrt{27} = 1.0392$, and hence the associated
probability lies between 0.75 and 0.80; for $k = 3$, $t = 1$ we have $d_2 =
\lambda_2\sqrt{N_2} = 0.4\sqrt{18} = 1.6971$, and hence the associated probability lies between
0.80 and 0.82. Interpolation will show that the associated probabilities are
equal to 0.7688 and 0.8094, respectively, and hence their product is equal to
0.6223. If a sample of size 8 is taken from each population, the corresponding
product is equal to 0.5960. Hence select 9 observations from each population.
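
The two factors in Example 3 can also be approximated numerically instead of by table interpolation. The sketch below assumes that, for $t = 1$ under the least favorable configuration, the probability of a correct selection among $k$ populations is $\int \Phi^{k-1}(z + d)\,\varphi(z)\,dz$; this is a standard expression for the problem and is taken here as an assumption about probability (20), which is not reproduced in this section. The function names and the integration settings are illustrative choices.

```python
import math


def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def Phi(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_select_best(k, d, lo=-8.0, hi=8.0, steps=4000):
    """Integral of Phi(z + d)^(k - 1) * phi(z) dz by composite Simpson's rule."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        z = lo + i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * Phi(z + d) ** (k - 1) * phi(z)
    return total * h / 3.0


def joint_probability(n, lam_rows=0.2, lam_cols=0.4, r=2, c=3):
    """Product of the row and column probabilities for n observations per cell."""
    d_rows = lam_rows * math.sqrt(n * c)  # each row mean averages n * c observations
    d_cols = lam_cols * math.sqrt(n * r)  # each column mean averages n * r observations
    return prob_select_best(r, d_rows) * prob_select_best(c, d_cols)


for n in (8, 9):
    print(n, round(joint_probability(n), 3))
```

With 9 observations per cell the product clears the required 0.60, while with 8 it falls just short, in line with the interpolated values quoted above.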

Example 4. Given three bivariate normal populations with unknown population
variances, covariances, and correlation coefficients. The population correlation
coefficient associated with the $i$th population will be denoted
by $\rho_i$ $(i = 1, 2, 3)$; the ranked $\rho_i$ will be denoted by $\rho_{[1]} \le \rho_{[2]} \le \rho_{[3]}$. It is not
known which population is associated with $\rho_{[i]}$. Suppose that it is desired to
find which population has the largest correlation coefficient, and to guarantee
that the probability of correctly choosing that population will be at least 0.90
with $\rho_{[3]} = 0.7$ and $\rho_{[3]} - \rho_{[2]} \ge 0.10$. How many observations must be taken
from each population?

Refer to Table I, column headed $k = 3$, $t = 1$.

The quantity $d = \sqrt{N}\lambda = \sqrt{N}(\mu_{[3]} - \mu_{[2]})/\sigma$ now is replaced by
$\sqrt{N - 3}\,\{\tfrac{1}{2}\log_e[(1 + \rho_{[3]})/(1 - \rho_{[3]})] - \tfrac{1}{2}\log_e[(1 + \rho_{[2]})/(1 - \rho_{[2]})]\} = \sqrt{N - 3}\,\log_e 1.1902$
$= 0.174\sqrt{N - 3} = 2.2302$ (from the table, for $P = 0.90$). Hence
select 168 observations from each population.
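
The arithmetic of Example 4 can be sketched as below. The function name `sample_size_correlations` is an illustrative assumption. The sample size is computed with the gap rounded to 0.174, as in the worked example, because the unrounded gap leaves $N$ very close to a whole number and the final rounding is sensitive to that choice.

```python
import math


def sample_size_correlations(d, z_gap):
    """Smallest integer N satisfying sqrt(N - 3) * z_gap >= d."""
    return math.ceil((d / z_gap) ** 2 + 3)


# Gap between the transformed correlations; atanh(r) = (1/2) log_e((1 + r) / (1 - r)).
z_gap = math.atanh(0.7) - math.atanh(0.6)
print(round(z_gap, 3))  # 0.174

# d = 2.2302 is the Table I value (k = 3, t = 1) for probability 0.90.
print(sample_size_correlations(2.2302, 0.174))  # 168
```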


9. Directions of future research. The results presented in this paper can be
extended and generalized in several directions. The formulation of problems in
terms of the ranking (multiple decision) approach rather than the test of homogeneity
approach can be applied equally well to parameters other than population
means of normal distributions. As an example of this, the writer has considered
the problem of ranking the population variances of normal distributions.
The results of this investigation, giving an exact rather than a large sample theory,
will be presented in a later paper.


Ranking problems can be formulated as several sample, or completely se- 
quential (rather than single sample) multiple decision procedures with resultant 
savings in the expected number of observations for a given probability of a 
correct ranking. Some promising results have been obtained thus far [2], [18], 
but many interesting unsolved problems remain, and additional research in 
this area should prove very fruitful. 

Among the unsolved problems for the single sample procedure, two are of 
particular interest. It would be very desirable to know whether the multiple 
decision procedures described in this paper are optimum in any sense. Also, 
it would be useful to have a simple procedure for determining the most efficient 
allocation of the sample sizes when the population variances are known and 
unequal. 







10. Acknowledgments. The writer is indebted to Mr. Cuthbert Daniel for
having proposed to him a practical problem which led the writer to formulate
the ranking approach. 

Particular appreciation is recorded here for the efforts of Dr. Milton Sobel 
who read early versions of this manuscript with painstaking care, and who made 
many very helpful suggestions. 

REFERENCES 

[1] Robert E. Bechhofer, Charles W. Dunnett and Milton Sobel, "A two-sample
multiple decision procedure for ranking means of normal populations with
unknown variances (Preliminary Report)," abstract, Ann. Math. Stat., Vol.
24 (1953), p. 136.

[2] Robert E. Bechhofer and Milton Sobel, "A sequential multiple decision procedure
for ranking means of normal populations with known variances (Preliminary
Report)," abstract, Ann. Math. Stat., Vol. 24 (1953), p. 136.

[3] Gertrude Blanch, NAML Report 52-53, National Bureau of Standards, Los Angeles,
January 5, 1952.

[4] William G. Cochran and Gertrude M. Cox, Experimental Designs, John Wiley and
Sons, Inc., New York, 1950.

[5] David B. Duncan, "A significance test for differences between ranked treatments
in an analysis of variance," Virginia J. Sci., Vol. 2, N. S. (1951), pp. 171-189.

[6] David B. Duncan, "On the properties of the multiple comparisons test," Virginia
J. Sci., Vol. 3, N. S. (1952), pp. 49-67.

[7] Churchill Eisenhart, "Probability that sample means are in opposite order to population
means," Selected Techniques of Statistical Analysis, McGraw-Hill Book
Co., New York, 1947, pp. 377-382.

[8] R. A. Fisher, The Design of Experiments, 4th ed., Oliver and Boyd Ltd., Edinburgh
and London, 1947.

[9] R. A. Fisher, Statistical Methods for Research Workers, 10th ed., Oliver and Boyd Ltd.,
Edinburgh and London, 1946.

[10] Frederick Mosteller, "A k-sample slippage test for an extreme population," Ann.
Math. Stat., Vol. 19 (1948), pp. 58-65.

[11] National Bureau of Standards, Personal communication on unpublished tables.

[12] Edward Paulson, "A multiple decision procedure for certain problems in the analysis
of variance," Ann. Math. Stat., Vol. 20 (1949), pp. 95-98.

[13] Edward Paulson, "On the comparison of several experimental categories with a control,"
Ann. Math. Stat., Vol. 23 (1952), pp. 239-246.

[14] Edward Paulson, "An optimum solution to the k-sample slippage problem for the
normal distribution," Ann. Math. Stat., Vol. 23 (1952), pp. 610-616.

[15] K. Pearson, Tables for Statisticians and Biometricians, Part II, 1st ed., Cambridge
University Press, 1931.

[16] Henry Scheffé, "A method for judging all contrasts in the analysis of variance,"
Biometrika, Vol. 40 (1953), pp. 87-104.

[17] Charles Stein, "A two sample test for a linear hypothesis whose power is independent
of the variance," Ann. Math. Stat., Vol. 16 (1945), pp. 253-258.

[18] Charles Stein, "On sequences of experiments," abstract, Ann. Math. Stat., Vol. 19
(1948), p. 117.

[19] John Tukey, "Allowances for various types of error rates," unpublished invited address
presented before a joint meeting of the Institute of Mathematical Statistics
and the Eastern North American Region of the Biometrics Society at Blacksburg,
Va., March 19, 1952.





NORMAL MULTIVARIATE ANALYSIS AND THE ORTHOGONAL GROUP¹
By A. T. James 
Princeton University 


1. Summary. New methods are introduced for deriving the sampling dis- 
tributions of statistics obtained from a normal multivariate population. Exterior 
differential forms are used to represent the invariant measures on the orthogonal 
group and the Grassmann and Stiefel manifolds. The first part is devoted to a 
mathematical exposition of these. In the second part, the theory is applied; 
first, to the derivation of the distribution of the canonical correlation coefficients 
when the corresponding population parameters are zero; and secondly, to split 
the distribution of a normal multivariate sample into three independent dis- 
tributions, (a) essentially the Wishart distribution, (b) the invariant distribu- 
tion of a random plane which is given by the invariant measure on the Grass- 
mann manifold, (c) the invariant distribution of a random orthogonal matrix. 
This decomposition provides derivations of the Wishart distribution and of the 
distribution of the latent roots of the sample variance covariance matrix when 
the population roots are equal. 


2. Introduction. Much of the distribution theory of normal multivariate 
analysis can be deduced from, or is closely related to the fact that the distribu- 
tion of a normal multivariate sample is invariant under orthogonal trans- 
formations. 

Consider a set of $n$ independent observations from a normal $k$-variate distribution
$(n \ge k)$ with a nonsingular variance covariance matrix $\Sigma$. In most
distribution problems one can eliminate the population means with the loss of
1 degree of freedom by a suitable orthogonal transformation. Assume this has
already been done. Let the rows of the $n \times k$ matrix $X$ be independent observations
from a normal $k$-variate distribution with zero means;

(2.1) $dF(X) = \dfrac{|\Sigma|^{-n/2}}{(2\pi)^{nk/2}} \, e^{-\frac{1}{2}\operatorname{tr}(\Sigma^{-1}X'X)} \prod_{i,j} dx_{ij}.$


The distribution is clearly invariant under the transformation 


(2.2) H: X — HX 


where H is an $n \times n$ orthogonal matrix. The invariance is a fundamental
property of dF; indeed, as Bartlett [1] has proved for the univariate case (and a
similar proof holds for the multivariate case), the invariance under (2.2) together with


Received 10/16/49, revised 6/25/53. 

1This work was initiated in the Section of Mathematical Statistics, Commonwealth 
Scientific and Industrial Research Organization, Australia and completed during the 
tenure of a C. 8. I. R. O. studentship at Princeton University and subsequently with sup- 
port from the Office of Naval Research. 







MULTIVARIATE ANALYSIS 41 


the independence of the rows of X uniquely characterizes the distribution 
(2.1) of X. 

With probability one, the columns of X, regarded as vectors in n-dimensional 
Euclidean space $R^n$, span a k-dimensional linear subspace (henceforth called
k-plane). Hotelling [8] observed that the invariance of the distribution of X 
under the group of transformations (2.2) implies that the k-plane is invariantly 
distributed. (A formal proof of this result will be given in Section 6.) He also 
recognized that the problem of finding the distribution of the canonical correla- 
tion coefficients could be reduced to the problem of finding the distribution of 
the cosines of the critical angles between the plane spanned by the columns of 
X and a plane distributed independently of X, or a fixed plane. From these 
observations Hotelling went on to obtain the distribution of the canonical 
correlation coefficients for the special case of two canonical correlation coeffi- 
cients (assuming the population correlations are zero). The general distribution 
was later derived by Fisher [5], Hsu [9], Roy [16], Girshick [6] and Mood [13], 
using different methods. 

To complete the derivation of the general result along the lines followed by
Hotelling, one requires a convenient analytic expression to represent the
invariant distribution of a random plane. Such an expression would also be very
useful in other connections. The most obvious way to obtain such an expression
would be as follows. A k-plane in $R^n$ can be specified by a system of $k(n - k)$
parameters, in fact, in many ways. The parameters will then have a certain
distribution in $R^{k(n-k)}$ corresponding to the invariant distribution of the random
plane. However, such methods lead to intractable expressions because, as we
shall see later on, they destroy the symmetry of the space of k-planes.

Instead, we consider the k-planes in $R^n$ as points of a space which is an analytic
manifold, the Grassmann manifold. Blaschke [2] has constructed an exterior 
differential form on the Grassmann manifold which may be considered as the 
probability density for an invariantly distributed random plane. By a simple 
transformation the exterior differential form may be expressed in terms of the 
critical angles, and hence the distribution of the canonical correlation coefficients 
obtained. This will be carried out in Section 7. 

Another analytic manifold, important in multivariate analysis, is the Stiefel 
manifold. A set of k orthonormal vectors in $R^n$ is called a k-frame. The k-frames
are the points of the Stiefel manifold. Both the Grassmann and the Stiefel mani- 
folds are coset spaces of the orthogonal group which is also an analytic manifold. 

The theory of Grassmann and Stiefel manifolds, exterior differential forms, 
etc. used in this derivation, is familiar to the differential geometer; but its litera- 
ture is widely scattered, and not readily accessible to the statistician unless he 
is prepared to go far more deeply into these subjects than is required here. It 
therefore seems desirable to give an outline of those parts of the theory that we 
require, in a form suitable for immediate application to problems of multivariate 
statistics. Sections 3 to 5 are devoted to this. 

Exterior differential forms on manifolds have evolved from a simple rule for 





42 A. T. JAMES 


transforming multiple and surface integrals in Euclidean space. It is based on 
an anticommutative multiplication of differentials (see Goursat [7] and Kähler
[11]). As Chern [3] has pointed out, it has potential application in statistics.
Although it is equivalent to the calculation of the Jacobian, it is usually simpler 
because it avoids the necessity of explicitly writing out bulky determinants. 

Section 3.1 gives the definition of an analytic manifold. In Sections 3.2 to 
3.4 the three analytic manifolds to be considered in this paper, namely the 
orthogonal group and the Grassmann and Stiefel manifolds, are defined and the 
relationship of the Grassmann and Stiefel manifolds as coset spaces of the orthog- 
onal group is explained. 

Exterior differential forms are introduced in Section 4.1 and their integrals 
defined in Section 4.2. The transformation of them is discussed in Section 4.3 
and it is shown how an invariant differential form yields an invariant measure. 

The exterior differential forms representing the invariant measures on the 
orthogonal group and the Grassmann and Stiefel manifolds are constructed in 
Sections 4.5 to 4.7 and their integrals are evaluated in Sections 5.1 to 5.3. 

Sections 6 and 7 give the derivation of the distribution of the canonical corre- 
lation coefficients, as outlined above, and in Section 8 the results stated in the 
summary on the decomposition of the distribution of a normal multivariate 
sample, are proved. Olkin [14] has given this decomposition by what amounts 
to using parameters for the Grassmann and Stiefel manifolds based on the Cay- 
ley parameters for the orthogonal group. 


3. The orthogonal group and its coset spaces. 

3.1. Analytic manifolds. An n-dimensional manifold, $\mathfrak{M}$, is a Hausdorff
topological space in which every point p has a neighbourhood $\mathfrak{O}_p$ with a system of
coordinates $x_1^p, \ldots, x_n^p$, that is, such that the map $r \to (x_1^p, \ldots, x_n^p)$ ($r \in \mathfrak{O}_p$) is
a one-to-one bicontinuous map (homeomorphism) of $\mathfrak{O}_p$ on an open set in real
Euclidean space, $R^n$. The coordinates $x_1^p, \ldots, x_n^p$ will be referred to as
coordinates centred at p. They are also coordinates centred at any other point of $\mathfrak{O}_p$.

If $x_1^p, \ldots, x_n^p$ and $x_1^q, \ldots, x_n^q$ are the coordinates of a point $r \in \mathfrak{O}_p \cap \mathfrak{O}_q$
relative to coordinate systems centred at p and q respectively, then since the
correspondences

    $(x_1^p, \ldots, x_n^p) \leftrightarrow r \leftrightarrow (x_1^q, \ldots, x_n^q)$

are homeomorphisms, it follows that $x_1^p, \ldots, x_n^p$ and $x_1^q, \ldots, x_n^q$ are
continuous (single valued) functions of each other. A manifold, together with a set of
overlapping coordinate systems, which cover the entire manifold and have the
property that the transformation between any two overlapping coordinate
systems is analytic, is called an analytic* manifold. (A function defined on $R^n$


* It would be sufficient for the applications in this paper only to assume that the func- 
tions have continuous derivatives, thus defining differentiable manifolds. But as we are 
applying the theory to the orthogonal group, the Grassmann and the Stiefel manifolds 
which are not only differentiable but indeed analytic, we may as well assume analyticity. 







is called analytic in a domain if it can be expanded as a convergent multiple 
power series in the neighbourhood of any point of that domain.) The systems of 
coordinates possessing the required properties are called admissible.

A familiar example of an analytic manifold is the surface of a unit sphere in
Euclidean space, for example, in $R^3$. A system of coordinates, centred at any
point p of the sphere, can be obtained by taking the orthogonal projection of the
open hemisphere with p as pole on the plane tangent to the sphere at p. This is
obviously a homeomorphism of the open hemisphere on the interior of the unit
circle in the tangent plane. Introduce coordinate axes in the tangent plane and
let $(x_1^p, x_2^p)$ be the coordinates of the projection on the tangent plane of a point
r in the hemisphere. Then $(x_1^p, x_2^p)$ serve as admissible coordinates for r. The
transformations between two such coordinate systems centred at p and q
respectively can be shown to be analytic in their domain of overlap.

More generally, the construction of admissible coordinate systems by projec- 
tion on the tangent plane can be applied to show that any algebraic variety 
which has a tangent plane at every point, is an analytic manifold. (An algebraic 
variety is a surface in Euclidean space determined by a system of algebraic 
equations). In particular, the orthogonal group and Stiefel manifold, which we 
shall now discuss, are analytic manifolds. 

DEFINITION. A function f defined on an analytic manifold is an analytic
function in the domain $\mathfrak{D}$ if, for any arbitrary coordinates $x_1, \ldots, x_n$ admissible
in a domain $\mathfrak{O}$, f is an analytic function of $x_1, \ldots, x_n$ in the domain $\mathfrak{D} \cap \mathfrak{O}$.

3.2. The orthogonal group O(n). An $n \times n$ matrix, A, satisfying the equation
$A'A = I_n$, where $I_n$ is the identity matrix and $A'$ means the transpose of A, is
called an orthogonal matrix. An equivalent definition is that A is the matrix of a
linear transformation which leaves the quadratic form $x_1^2 + \cdots + x_n^2$ invariant.
The set of all $n \times n$ orthogonal matrices with the operation of matrix
multiplication is called the orthogonal group, O(n).

There are $\frac{1}{2}n(n + 1)$ functionally independent conditions on the $n^2$ elements
of an orthogonal matrix $A \in O(n)$; consequently, the elements of A can be
regarded as the coordinates of a point on a $\frac{1}{2}n(n - 1)$-dimensional algebraic
variety or surface in Euclidean $n^2$-space. Since $\sum_{i,j} a_{ij}^2 = n$, the group surface
is a subset of the sphere of radius $\sqrt{n}$ in $n^2$-space.
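The counting argument above can be checked numerically on a small case. The following is a minimal sketch, not part of the original paper: the helper names (`matmul`, `plane_rotation`, `sample_orthogonal`) and the test angles are assumptions made for illustration. It builds one orthogonal matrix as a product of plane rotations and checks the two facts used in the text: that the orthogonality conditions hold, and that the sum of squares of the elements equals n.

```python
import math

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][l] * b[l][j] for l in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def plane_rotation(n, i, j, theta):
    """n x n identity with a rotation by theta in the (i, j) coordinate plane."""
    g = [[float(r == c) for c in range(n)] for r in range(n)]
    g[i][i] = g[j][j] = math.cos(theta)
    g[i][j], g[j][i] = -math.sin(theta), math.sin(theta)
    return g

def sample_orthogonal(n, angles):
    """Multiply one plane rotation per pair i < j (hypothetical helper)."""
    a = [[float(r == c) for c in range(n)] for r in range(n)]
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    for (i, j), theta in zip(pairs, angles):
        a = matmul(a, plane_rotation(n, i, j, theta))
    return a

n = 4
num_conditions = n * (n + 1) // 2       # equations contained in A'A = I_n
dimension = n * n - num_conditions      # n^2 - n(n+1)/2 = n(n-1)/2
angles = [0.3 * (t + 1) for t in range(dimension)]   # arbitrary test angles
A = sample_orthogonal(n, angles)
gram = matmul(transpose(A), A)          # should be the identity matrix
sum_of_squares = sum(x * x for row in A for x in row)   # should equal n
```

The sketch only samples a single group element; it does not claim that products of this particular form reach every orthogonal matrix.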

In 1896, Hurwitz [10] pointed out that the element of area of the group sur- 
face is a two-sided invariant measure on O(n), that is, invariant under left and 
right translations, by which we mean that the respective transformations

(3.1)    $A \to HA$
(3.2)    $A \to AH$        ($H \in O(n)$)

leave the element of area invariant. For, suppose X is an $n \times n$ matrix,
regarded as a vector in an $n^2$-dimensional space, and transformed by $H \in O(n)$;

(3.3)    $X \to HX$ or $X \to XH$.







These are linear transformations of the $n^2$-space which leave the quadratic
form $\sum_{i,j} x_{ij}^2$ invariant. Therefore they are orthogonal transformations (of order
$n^2 \times n^2$). But the area of a surface in Euclidean space is invariant under
orthogonal transformations. Hence the area of the group surface in $n^2$-space is
invariant under (3.1) and (3.2).
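The step from invariance of the quadratic form to invariance of area rests on the fact that left and right multiplication by H preserve Euclidean distances in $n^2$-space. A minimal numerical sketch of that fact follows; the matrices X and Y, the rotation angles, and the helper names are illustrative assumptions, not taken from the paper.

```python
import math

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][l] * b[l][j] for l in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def plane_rotation(n, i, j, theta):
    """n x n identity with a rotation by theta in the (i, j) coordinate plane."""
    g = [[float(r == c) for c in range(n)] for r in range(n)]
    g[i][i] = g[j][j] = math.cos(theta)
    g[i][j], g[j][i] = -math.sin(theta), math.sin(theta)
    return g

def squared_distance(x, y):
    """Sum of squared differences of the elements, i.e. distance in n^2-space."""
    return sum((a - b) ** 2 for rx, ry in zip(x, y) for a, b in zip(rx, ry))

n = 3
H = matmul(plane_rotation(n, 0, 1, 0.7), plane_rotation(n, 1, 2, -1.1))
X = [[1.0, -2.0, 0.5], [0.0, 3.0, 1.5], [-1.0, 0.25, 2.0]]
Y = [[0.5, 1.0, -1.0], [2.0, 0.0, 0.75], [1.5, -0.5, 1.0]]

original = squared_distance(X, Y)
after_left = squared_distance(matmul(H, X), matmul(H, Y))    # X -> HX
after_right = squared_distance(matmul(X, H), matmul(Y, H))   # X -> XH
```

Both translated distances agree with the original one up to rounding, which is what makes the transformations (3.3) orthogonal transformations of the $n^2$-space.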

The invariant measure is sometimes referred to as the "Haar" measure on
the orthogonal group, named after Haar, who, in 1933, generalized Hurwitz’s 
result by proving the existence of an invariant measure on any locally compact 
topological group. Herglotz and Blaschke [2] have derived an exterior differen- 
tial form for the invariant measure on O(n) which is, (apart from a scale factor), 
a convenient expression for the area of the surface. We shall derive it later on. 

3.3. The Stiefel manifold $V_{k,n}$. Let us call a set of k orthonormal vectors in
Euclidean n-space a "k-frame". The k-frames are the points of the Stiefel
manifold, $V_{k,n}$. Regarding the k vectors of a k-frame as the columns of a matrix A,
we can represent the Stiefel manifold as the set of $n \times k$ ($k \le n$) rectangular
matrices, A, satisfying the equation $A'A = I_k$. $V_{k,n}$ is a $\frac{1}{2}k(2n - k - 1)$-dimensional
algebraic variety in nk-dimensional Euclidean space and an analytic
manifold. The same argument as for the orthogonal group shows that the
element of area of this surface is a measure invariant under (3.1).
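As an illustration of the definition, a k-frame can be produced from any k linearly independent vectors by orthonormalization. The sketch below is an assumption-laden example rather than anything from the paper: the input vectors and the helper `gram_schmidt` are hypothetical. It checks $A'A = I_k$ and the dimension count $nk - \frac{1}{2}k(k+1) = \frac{1}{2}k(2n - k - 1)$.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Orthonormalize linearly independent vectors (modified Gram-Schmidt)."""
    frame = []
    for v in vectors:
        w = list(v)
        for q in frame:
            c = dot(q, w)
            w = [wi - c * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(dot(w, w))
        frame.append([wi / norm for wi in w])
    return frame

n, k = 5, 3
vectors = [[1.0, 2.0, 0.0, -1.0, 0.5],
           [0.0, 1.0, 1.0, 3.0, -2.0],
           [2.0, -1.0, 4.0, 0.0, 1.0]]
frame = gram_schmidt(vectors)            # the k columns of A, each a vector in R^n
gram = [[dot(p, q) for q in frame] for p in frame]   # A'A, should be I_k

num_conditions = k * (k + 1) // 2        # equations contained in A'A = I_k
dimension = n * k - num_conditions       # dimension of the Stiefel manifold
```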

A group of transformations is said to act transitively in a space if, given any
two points of the space, there is an element of the group which transforms one
into the other. Such a space is said to be homogeneous with respect to the group.
If $x_0$ is any point of a homogeneous space $\mathfrak{X}$ (with respect to a group $\mathfrak{H}$) and $\mathfrak{H}_0$
is the subgroup consisting of all elements of $\mathfrak{H}$ which leave $x_0$ invariant, and if
$h \in \mathfrak{H}$ transforms $x_0$ into x, then the set of all elements of $\mathfrak{H}$ which transform
$x_0$ into x is the coset $h\mathfrak{H}_0$. Hence the points $x \in \mathfrak{X}$ are in one-to-one
correspondence with the cosets $h\mathfrak{H}_0$. Thus a space, homogeneous with respect to a group
of transformations, may be regarded as a space of cosets of the group.

The Stiefel manifold is obviously homogeneous with respect to the orthogonal
group of transformations acting on $V_{k,n}$ according to (3.1). If $A_0 \in V_{k,n}$, say for
simplicity

    $A_0 = \begin{bmatrix} I_k \\ 0 \end{bmatrix},$

then the group $O_0$ which leaves $A_0$ invariant is the set of square matrices of the
form

    $\begin{bmatrix} I_k & 0 \\ 0 & H_{n-k} \end{bmatrix}$

where $H_{n-k}$ is any $(n - k) \times (n - k)$ orthogonal matrix and $I_k$ is the unit matrix
of order k.

Hence the coset corresponding to $A \in V_{k,n}$ is

    $[A \,\vdots\, B] \begin{bmatrix} I_k & 0 \\ 0 & O(n - k) \end{bmatrix}$







where B is any $n \times (n - k)$ matrix such that the partitioned matrix $[A \,\vdots\, B]$ is
orthogonal and $O(n - k)$ is the group of orthogonal matrices of order $n - k$.*

3.4. The Grassmann manifold. The points of the Grassmann manifold, $G_{k,r}$
($r = n - k$), are the k-dimensional planes (passing through the origin) in
Euclidean n-space, $R^n$. For our purposes, the following obviously equivalent
definition is useful. Consider the set, $\mathfrak{X}$, of all $n \times k$ matrices ($k \le n$) of rank k;

    $X = \begin{bmatrix} x_{11} & \cdots & x_{1k} \\ \vdots & & \vdots \\ x_{n1} & \cdots & x_{nk} \end{bmatrix}$

and the group of transformations $X \to XL$ where L is any nonsingular $k \times k$
matrix. The group defines an equivalence relation in $\mathfrak{X}$. Two elements of $\mathfrak{X}$ are
equivalent if there is an element of the group which transforms one into the
other. This is possible if and only if the column vectors of the two matrices
span the same k-plane in the Euclidean n-space, $R^n$, of column vectors. Hence
the equivalence classes of $\mathfrak{X}$ are in one-to-one correspondence with the points
of the Grassmann manifold, $G_{k,r}$.

$G_{k,r}$ is an analytic manifold. It has dimension $k(n - k)$, because X may be
regarded as a point in Euclidean nk-space and for each fixed X the set of all
elements XL in the equivalence class is a surface in $R^{nk}$ of dimension $k^2$. Hence
the dimension of $G_{k,r}$ is $nk - k^2 = k(n - k)$.
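The equivalence $X \sim XL$ can be made concrete by comparing the orthogonal projectors onto the column spaces of X and XL; two matrices span the same k-plane exactly when these projectors coincide. The projector formula $X(X'X)^{-1}X'$ is a standard device and is not used in the paper; the matrices X and L below and the helper names are likewise assumptions chosen for illustration (with k = 2 so that a closed-form inverse suffices).

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][l] * b[l][j] for l in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def projector(x):
    """Orthogonal projector X (X'X)^{-1} X' onto the columns of an n x 2 matrix."""
    xt = transpose(x)
    return matmul(matmul(x, inverse_2x2(matmul(xt, x))), xt)

n, k = 4, 2
X = [[1.0, 0.0], [2.0, 1.0], [0.0, -1.0], [1.0, 3.0]]   # rank 2
L = [[2.0, 1.0], [-1.0, 3.0]]                            # det = 7, nonsingular

P_X = projector(X)
P_XL = projector(matmul(X, L))
max_diff = max(abs(P_X[i][j] - P_XL[i][j]) for i in range(n) for j in range(n))

dimension = n * k - k * k       # nk - k^2 = k(n - k)
```

The projector depends only on the equivalence class of X, so it serves here as a convenient label for a point of the Grassmann manifold.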

Like the Stiefel manifold, the Grassmann manifold can be regarded as a coset
space of the orthogonal group O(n). An orthogonal transformation of $R^n$
transforms k-planes into k-planes; thus it induces a transformation of $G_{k,r}$. We shall
use the same symbol for the induced transformation of $G_{k,r}$ as for the original
transformation of $R^n$. In this sense, the orthogonal group O(n) is a transitive
group of transformations of $G_{k,r}$ because, given any two k-planes in $R^n$, there
exists a rotation which transforms one into the other.

If $p_0$ is any fixed point of $G_{k,r}$ and $O_0$ is the subgroup of all elements of O(n)
which leave $p_0$ invariant, and if $H \in O(n)$ transforms $p_0$ into $p \in G_{k,r}$, then the
set of all elements of O(n) which transform $p_0$ into p are the elements of the coset
$HO_0$. Hence the cosets $HO_0$, $H \in O(n)$, are in one-to-one correspondence with
the points $p \in G_{k,r}$.

Suppose, for simplicity, we let $p_0$ be the plane spanned by the first k
coordinate axes. The cosets are then of the form

(3.4)    $[A \,\vdots\, B] \begin{bmatrix} O(k) & 0 \\ 0 & O(n - k) \end{bmatrix}$

where the first k columns of the matrix on the left are orthonormal vectors
spanning the plane p and the last $n - k$ columns are likewise orthonormal
vectors, but they span the orthogonal complement of p. The matrix is thus an


* The orthogonal group manifold can be expressed as a fibre bundle with the Stiefel mani- 
fold as the base space and the subgroup O(n — k) as the fibre. Steenrod [18] discusses the 
Stiefel and Grassmann manifolds from this point of view. 







element of O(n). The matrix on the right denotes the subgroup $O_0$ of O(n)
consisting of all matrices which are a direct sum of orthogonal matrices of orders
k and $n - k$ respectively. $O_0$ is the subgroup which leaves the plane spanned
by the first k coordinate axes invariant.
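The coset description (3.4) says that multiplying an orthogonal matrix on the right by a direct sum of orthogonal matrices of orders k and n − k does not change the plane spanned by its first k columns. A small numerical sketch of this statement follows, under stated assumptions: the particular rotations, the case n = 4, k = 2, and the helper names are illustrative choices, and only rotation blocks (not reflections) are sampled.

```python
import math

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][l] * b[l][j] for l in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def plane_rotation(n, i, j, theta):
    """n x n identity with a rotation by theta in the (i, j) coordinate plane."""
    g = [[float(r == c) for c in range(n)] for r in range(n)]
    g[i][i] = g[j][j] = math.cos(theta)
    g[i][j], g[j][i] = -math.sin(theta), math.sin(theta)
    return g

def plane_projector(m, k):
    """Projector A A' onto the plane spanned by the first k (orthonormal) columns."""
    n = len(m)
    return [[sum(m[i][c] * m[j][c] for c in range(k)) for j in range(n)]
            for i in range(n)]

n, k = 4, 2
# An orthogonal matrix [A : B] mixing all four coordinates.
Q = matmul(matmul(plane_rotation(n, 0, 2, 0.8), plane_rotation(n, 1, 3, -0.6)),
           plane_rotation(n, 1, 2, 1.3))
# An element of the subgroup O_0: rotation of order k in the first block,
# rotation of order n - k in the second block.
D = matmul(plane_rotation(n, 0, 1, 0.9), plane_rotation(n, 2, 3, -0.4))

P_before = plane_projector(Q, k)
P_after = plane_projector(matmul(Q, D), k)
max_diff = max(abs(P_before[i][j] - P_after[i][j])
               for i in range(n) for j in range(n))
```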


4. Exterior differential forms on manifolds. 


4.1. Definition. Consider a multiple integral over a domain A in Euclidean
space $R^n$;

(4.1)    $I = \int \cdots \int_A f(x_1, \ldots, x_n)\, dx_1 \cdots dx_n.$

On making a change of variables

(4.2)    $x_1 = x_1(u_1, \ldots, u_n), \quad \ldots, \quad x_n = x_n(u_1, \ldots, u_n)$

we have

(4.3)    $I = \int f(x(u)) \det\left(\dfrac{\partial x_i}{\partial u_j}\right) du_1 \cdots du_n.$

To calculate the Jacobian, instead of writing out the matrix of partial
derivatives $(\partial x_i/\partial u_j)$ and calculating its determinant, we can evaluate it in the
following way. Differentiate the transformations (4.2);

(4.4)    $dx_i = \sum_{j=1}^{n} \dfrac{\partial x_i}{\partial u_j}\, du_j$

and substitute the linear differential forms (4.4) in (4.1);

(4.5)    $I = \int f(x(u)) \left(\sum_j \dfrac{\partial x_1}{\partial u_j}\, du_j\right) \cdots \left(\sum_j \dfrac{\partial x_n}{\partial u_j}\, du_j\right).$


Now multiply out the differential forms in (4.5) in a formal manner using the
associative and distributive laws, but instead of the commutative law use an
anticommutative rule for multiplying differentials; that is, put

(4.6)    $du_i\, du_j = -du_j\, du_i.$

In particular, $du_i\, du_i = -du_i\, du_i = 0$.
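As a small worked case of the rule (4.6), not taken from the paper, consider n = 2 with a linear change of variables. The coefficients a, b, c, d are introduced here only for illustration.

```latex
\begin{aligned}
dx_1 &= a\,du_1 + b\,du_2, \qquad dx_2 = c\,du_1 + d\,du_2,\\
dx_1\,dx_2 &= (a\,du_1 + b\,du_2)(c\,du_1 + d\,du_2)\\
&= ac\,du_1\,du_1 + ad\,du_1\,du_2 + bc\,du_2\,du_1 + bd\,du_2\,du_2\\
&= (ad - bc)\,du_1\,du_2 .
\end{aligned}
```

The terms in $du_1\,du_1$ and $du_2\,du_2$ vanish, and $du_2\,du_1 = -du_1\,du_2$, so the surviving coefficient is the 2 × 2 determinant of the coefficient matrix.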

The justification for this formal procedure is that the rules are consistent 
and lead to the correct result as given in (4.3) (see Goursat [7] chap. 3). In fact, 
the formal procedure is equivalent to calculating the Jacobian as is shown by 
the following 

LEMMA 4.1. If du is a column vector of n differentials and if dx = J du, where J
is an $n \times n$ matrix and thus dx is a column vector of linear differential forms,
then the anticommutative product of the elements of dx is the anticommutative
product of the elements of du multiplied by $|J|$; that is

(4.7)    $\prod_{i=1}^{n} dx_i = |J| \prod_{i=1}^{n} du_i.$

PROOF. The left-hand side of (4.7) is clearly equal to $\prod_{i=1}^{n} du_i$ multiplied by
a polynomial p(J) in the elements of J, which is linear in each row of J.
Interchanging the order of two factors, say $dx_i$ and $dx_j$, reverses the sign of $\prod_{i=1}^{n} dx_i$.
However, it is also equivalent to interchanging the ith and jth rows of the
matrix J. Thus interchange of two rows of J reverses the sign of p(J). Finally,
if J is the identity matrix then p(J) = 1. Hence, according to the Weierstrass
definition of a determinant, $p(J) = |J|$.

The formal procedure may also be used to transform surface integrals.
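Lemma 4.1 can be tried out mechanically. The sketch below is one possible encoding, not the author's: a form is stored as a dictionary from increasing index tuples to coefficients, and the function names `wedge_linear` and `anticommutative_product` as well as the test matrix are hypothetical. Multiplying out the rows of J with the anticommutative rule (4.6) leaves a single term whose coefficient is the determinant.

```python
def wedge_linear(form, row):
    """Right-multiply `form` by the linear form sum_j row[j] du_j.

    `form` maps increasing index tuples (i1, ..., ir) to the coefficient of
    du_{i1} ... du_{ir}.
    """
    out = {}
    for idx, coeff in form.items():
        for j, c in enumerate(row):
            if c == 0 or j in idx:
                continue                      # du_j du_j = 0
            # Moving du_j left into sorted position costs one sign change
            # for every factor with a larger index that it passes.
            swaps = sum(1 for i in idx if i > j)
            key = tuple(sorted(idx + (j,)))
            out[key] = out.get(key, 0) + (-1) ** swaps * coeff * c
    return out

def anticommutative_product(J):
    """Product dx_1 ... dx_n where dx_i = sum_j J[i][j] du_j."""
    form = {(): 1}
    for row in J:
        form = wedge_linear(form, row)
    return form

J = [[2, 0, 1],
     [1, 3, -1],
     [0, 5, 4]]
product = anticommutative_product(J)
top_coefficient = product.get((0, 1, 2), 0)   # coefficient of du_1 du_2 du_3
```

For this matrix, cofactor expansion along the first row gives 2(12 + 5) + 1(5 − 0) = 39, which is the value the formal multiplication returns.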


An exterior differential form of degree r in Euclidean space $R^n$ is a formal
expression of the type

(4.8)    $\sum_{i_1 < i_2 < \cdots < i_r} u_{i_1 \ldots i_r}(x)\, dx_{i_1} \cdots dx_{i_r}$

where $u_{i_1 \ldots i_r}(x)$ are analytic functions of $x_1, \ldots, x_n$. It may be regarded as
the integrand of an r-dimensional surface integral. The exterior product of a
form of degree r with a form of degree s is defined as the form of degree $r + s$
which is obtained by formal multiplication of the two forms using the
associative, distributive, and anticommutative laws for the multiplication of the
symbols $dx_i$.

A form of degree n has only one term, namely $u(x)\, dx_1 \cdots dx_n$. A form of
degree greater than n is zero because it has at least one of the symbols $dx_i$
repeated in each term.

The definition may be extended to define an exterior differential form on an
analytic manifold $\mathfrak{M}$. Relative to a system of admissible coordinates on the
manifold, it is an expression of the type (4.8).

DEFINITION. An exterior differential form $\omega(p)$, $p \in \mathfrak{M}$, on an analytic manifold
is a system of expressions of type (4.8), one for each admissible coordinate
system, such that if $x_1, \ldots, x_n$ and $y_1, \ldots, y_n$ are two coordinate systems,
then in the domain of overlap of these coordinates the corresponding expressions
of type (4.8) with coefficients $u_{i_1 \ldots i_r}$ and $v_{i_1 \ldots i_r}$ respectively, are related by the
transformation

(4.9)    $\sum_{i_1 < \cdots < i_r} u_{i_1 \ldots i_r}(x(y)) \left(\sum_j \dfrac{\partial x_{i_1}}{\partial y_j}\, dy_j\right) \cdots \left(\sum_j \dfrac{\partial x_{i_r}}{\partial y_j}\, dy_j\right) = \sum_{i_1 < \cdots < i_r} v_{i_1 \ldots i_r}(y)\, dy_{i_1} \cdots dy_{i_r}.$


The exterior product of two exterior differential forms $\omega$ and $v$ is defined as
the exterior differential form whose representation in any coordinate system is
the product of the representations of $\omega$ and $v$ in that coordinate system. One
can check that the resulting product transforms according to the rule (4.9) on
change of coordinates. Hence it constitutes an exterior differential form on the
manifold.

If $p_0$ is a fixed point of $\mathfrak{M}$ with coordinates $x^0$, we shall call the expression

    $\omega(p_0) = \sum u_{i_1 \ldots i_r}(x^0)\, dx_{i_1} \cdots dx_{i_r}$







the value of $\omega(p)$ at $p_0$. The values

    $u_1(x^0)\, dx_1 + \cdots + u_n(x^0)\, dx_n$

of linear differential forms at a fixed point $p_0$ of the manifold can be considered
as vectors in an n-dimensional vector space with $dx_1, \ldots, dx_n$ as its basis
vectors. This space is called the tangent space to the manifold at $p_0$. It is the
analogue of the tangent plane to a surface in Euclidean space.

DEFINITION. The differential of an analytic function f on a manifold is the
linear exterior differential form represented in coordinates $x_1, \ldots, x_n$ by

    $df = \dfrac{\partial f}{\partial x_1}\, dx_1 + \cdots + \dfrac{\partial f}{\partial x_n}\, dx_n.$

The ordinary rules of calculus show that such expressions transform correctly,
that is, in accordance with (4.9).

4.2. Integration of exterior differential forms. It is, of course, possible to define 
the integral of a differential form of degree r over a submanifold of dimension 
r or a measurable subset thereof. However, for our applications we only require 
the integrals of differential forms of maximum degree, that is, of degree equal 
to the dimension of the manifold, n, and we shall restrict our definition to these. 

Expressed in coordinates, an exterior differential form of maximum degree
has only one term

(4.10)    $U(x^p)\, dx_1^p \cdots dx_n^p.$

As our integrals will be interpreted as probabilities, we require that, if $U(x^p)$
is not positive, it be replaced by its modulus. The domain of integration is
divided into subdomains or cells $C_i$, each contained in an admissible coordinate
system. The admissible coordinates map the cell $C_i$ into a domain in $R^n$, again
denoted $C_i$, and the differential form, expressed in these coordinates, is then
regarded as the integrand of an ordinary volume integral over the domain $C_i$ in
$R^n$ and evaluated as such. Thus to integrate (4.10) over a subdomain $C_i$ in an
admissible coordinate system, we take the multiple integral

(4.11)    $\int_{C_i} |U(x_1^p, \ldots, x_n^p)\, dx_1^p \cdots dx_n^p|.$
Cs 


The sum of the integrals over the (finite number of) subdomains, into which
the total domain of integration has been divided, is defined as the integral of
the differential form. The integral of the differential form does not depend on
the subdivision of the domain of integration; that is, if a portion of the domain
of integration has two admissible coordinate systems, say $x^p$ and $x^q$, they give
the same integral, namely

    $\int |U(x^p)\, dx_1^p \cdots dx_n^p| = \int \left| U(x^p(x^q)) \det\left(\dfrac{\partial x_i^p}{\partial x_j^q}\right) dx_1^q \cdots dx_n^q \right|,$

according to the classical formula for the transformation of multiple integrals
in Euclidean space.

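Corresponding to an exterior differential form $\omega(p)$ of maximum degree on a
manifold $\mathfrak{X}$ ($p \in \mathfrak{X}$), there is a measure $\mu$ given by

(4.11a)    $\mu(\mathfrak{S}) = \int_{\mathfrak{S}} \omega(p), \qquad \mathfrak{S} \subset \mathfrak{X}.$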
S$ 


In the general theory of integration of exterior differential forms, a difficulty 
arises as to which sign should be assigned to the integrals (4.11) over the compo- 
nent domains before taking their sum. It is connected with the orientation of the 
domains. However, as we are only going to integrate exterior differential forms 
representing probability densities, we have been able to avoid the difficulty by 
defining only positive integrals. Changing the sign of an exterior differential 
form does not alter its integral, as defined above. Hence we may ignore the sign 
of an exterior differential form of maximum degree and ignore questions of orien- 
tability of the manifolds. 

To integrate a function on the manifold with respect to the differential form,
express it as a function of the coordinates $x_1^p, \ldots, x_n^p$ and include it under the
integral sign in (4.11).
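The procedure can be illustrated on the unit sphere of Section 3.1, covered (apart from the equator, a set of measure zero) by two open hemispheres with the tangent-plane projection coordinates. The sketch below assumes the standard expression for the element of area in these coordinates, $dx_1\,dx_2/\sqrt{1 - x_1^2 - x_2^2}$, which is not stated in the paper; the function names and the substitution $x_1 = \sin t \cos\phi$, $x_2 = \sin t \sin\phi$ used for the quadrature are likewise choices made here.

```python
import math

def density(x1, x2):
    """Coefficient U of the area form U dx1 dx2 on the open hemisphere."""
    return 1.0 / math.sqrt(1.0 - x1 * x1 - x2 * x2)

def hemisphere_area(steps=2000):
    """Integrate |U dx1 dx2| over the open unit disc by the midpoint rule in t.

    With x1 = sin t cos phi, x2 = sin t sin phi the integrand does not depend
    on phi, so the phi-integral contributes a factor 2*pi.
    """
    h = (math.pi / 2) / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        r = math.sin(t)
        jacobian = r * math.cos(t)       # |d(x1, x2) / d(t, phi)|
        total += abs(density(r, 0.0) * jacobian) * h
    return 2.0 * math.pi * total

# Two cells (the two open hemispheres), each integrated in its own chart.
sphere_area = hemisphere_area() + hemisphere_area()
```

The sum over the two cells reproduces the familiar value $4\pi$ to within the quadrature error, in line with the definition of the integral as a sum over subdomains.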

4.3. Transformation of exterior differential forms. A one-to-one map of a
manifold $\mathfrak{X}$ on another, $\mathfrak{Y}$, induces maps of the functions on $\mathfrak{X}$ to functions on $\mathfrak{Y}$,
measures on $\mathfrak{X}$ to measures on $\mathfrak{Y}$ and differential forms on $\mathfrak{X}$ to differential forms
on $\mathfrak{Y}$.

Suppose f is an analytic homeomorphism of the analytic manifold $\mathfrak{X}$ on
another, $\mathfrak{Y}$. By f being an analytic homeomorphism, we mean that the map f is
one-to-one, and if $x_1, \ldots, x_n$ are coordinates of $p \in \mathfrak{X}$ and $y_1, \ldots, y_n$ are
coordinates of the image point $q = f(p) \in \mathfrak{Y}$, then the $y_i$ are analytic functions of
$x_1, \ldots, x_n$ and the $x_i$ are analytic functions of $y_1, \ldots, y_n$. Since f is
one-to-one, q and p are functions of each other. Put $p = f^{-1}(q)$.

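DEFINITION. If $\phi(p)$ is an analytic function on $\mathfrak{X}$, f induces a mapping of it to
a function $\tilde\phi(q)$ on $\mathfrak{Y}$ given by

(4.12)    $\phi(p) \xrightarrow{f} \tilde\phi(q) = \phi(f^{-1}(q)).$

DEFINITION. If $\mu$ is a measure on $\mathfrak{X}$, f induces a mapping of it to a measure $\tilde\mu$
on $\mathfrak{Y}$ given by

(4.13)    $\mu \xrightarrow{f} \tilde\mu$

where

    $\tilde\mu(\mathfrak{T}) = \mu(f^{-1}(\mathfrak{T})), \qquad \mathfrak{T} \subset \mathfrak{Y},$

and $f^{-1}(\mathfrak{T})$ denotes the inverse image of $\mathfrak{T}$, that is, the set of all points of $\mathfrak{X}$
mapped into $\mathfrak{T}$ by f.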


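DEFINITION. The image of a differential form under the mapping induced by f
is obtained by replacing $dx_i$ by $\sum_j (\partial x_i/\partial y_j)\, dy_j$, the coefficient functions by
their images under the map f, and using the rules for exterior products. Thus

(4.14)    $\sum_{i_1 < \cdots < i_r} u_{i_1 \ldots i_r}(x)\, dx_{i_1} \cdots dx_{i_r} \xrightarrow{f} \sum_{i_1 < \cdots < i_r} u_{i_1 \ldots i_r}(f^{-1}(y)) \left(\sum_j \dfrac{\partial x_{i_1}}{\partial y_j}\, dy_j\right) \cdots \left(\sum_j \dfrac{\partial x_{i_r}}{\partial y_j}\, dy_j\right).$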


This mapping could be carried out using arbitrary coordinate systems
$x_1, \ldots, x_n$ and $y_1, \ldots, y_n$ in $\mathfrak{X}$ and in $\mathfrak{Y}$ respectively. It can easily be shown
that mappings carried out in different coordinates agree with one another. From
the definition of the mapping, it is clear that the map of a product of two
differential forms is the product of their maps. The map of a differential form of
maximum degree gives the map of the corresponding measure, according to the

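LEMMA 4.2. If $\mu$ is a measure on $\mathfrak{X}$ given by the differential form $\omega(p)$,

(4.15)    $\mu(\mathfrak{S}) = \int_{\mathfrak{S}} \omega(p), \qquad \mathfrak{S} \subset \mathfrak{X},$

and f is an analytic homeomorphism of $\mathfrak{X}$ on $\mathfrak{Y}$ which induces maps of $\mu$ on $\tilde\mu$ and
$\omega(p)$ on $\tilde\omega(q)$, then the measure $\tilde\mu$ is given by the differential form $\tilde\omega(q)$:

(4.16)    $\tilde\mu(\mathfrak{T}) = \int_{\mathfrak{T}} \tilde\omega(q), \qquad \mathfrak{T} \subset \mathfrak{Y}.$

PROOF. By definition

(4.17)    $\tilde\mu(\mathfrak{T}) = \mu(f^{-1}(\mathfrak{T})) = \int_{f^{-1}(\mathfrak{T})} \omega(p).$

Suppose $\omega(p) = g(x)\, dx_1 \cdots dx_n$. Then

(4.18)    $\tilde\omega(q) = g(f^{-1}(y)) \left(\sum_j \dfrac{\partial x_1}{\partial y_j}\, dy_j\right) \cdots \left(\sum_j \dfrac{\partial x_n}{\partial y_j}\, dy_j\right) = g(f^{-1}(y)) \det\left(\dfrac{\partial x_i}{\partial y_j}\right) dy_1 \cdots dy_n,$

and hence

(4.19)    $\int_{f^{-1}(\mathfrak{T})} \omega(p) = \int_{\mathfrak{T}} \tilde\omega(q)$

by the classical formula for the transformation of multiple integrals. (4.17) and
(4.19) imply (4.16). Q.E.D.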
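A one-dimensional instance of Lemma 4.2 can be checked by quadrature. Everything specific in the sketch below is an assumption made for illustration: the manifold is the interval (0, 1), the form is $\omega = 3x^2\,dx$, the map is $f(x) = x^2$, and the set is the interval (0.25, 0.81). The mapped form is obtained from the definition preceding (4.14) by replacing $dx$ with $(dx/dy)\,dy$.

```python
import math

def midpoint_integral(func, lo, hi, steps=20000):
    """Midpoint-rule approximation to the integral of func over (lo, hi)."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (i + 0.5) * h) for i in range(steps))

def g(x):
    """Coefficient of the form omega = g(x) dx on the interval (0, 1)."""
    return 3.0 * x * x

def f_inverse(y):
    """Inverse of the map f(x) = x^2, an analytic homeomorphism of (0, 1)."""
    return math.sqrt(y)

def dx_dy(y):
    """Derivative of x = f^{-1}(y) with respect to y."""
    return 0.5 / math.sqrt(y)

def g_tilde(y):
    """Coefficient of the mapped form: g(f^{-1}(y)) * dx/dy, as in (4.18)."""
    return g(f_inverse(y)) * dx_dy(y)

T = (0.25, 0.81)
preimage = (f_inverse(T[0]), f_inverse(T[1]))     # the interval (0.5, 0.9)

mu_of_preimage = midpoint_integral(g, *preimage)  # right-hand side of (4.17)
integral_of_image_form = midpoint_integral(g_tilde, *T)   # right-hand side of (4.16)
```

Both numbers approximate $0.9^3 - 0.5^3 = 0.604$, so the measure induced on the image agrees with the integral of the mapped form, as (4.19) asserts.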

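An important case is the mapping of a manifold $\mathfrak{X}$ on itself by an analytic
homeomorphism f. If f maps a set $\mathfrak{S} \subset \mathfrak{X}$ onto a set $\mathfrak{T} = f(\mathfrak{S}) \subset \mathfrak{X}$ and an
exterior differential form $\omega$ into $\tilde\omega$, then (4.19) holds. The differential form $\omega$
is said to be invariant under f if $\tilde\omega(q) = \omega(q)$. In this case, by (4.19), we have

(4.20)    $\int_{\mathfrak{S}} \omega(p) = \int_{\mathfrak{T}} \tilde\omega(q) = \int_{\mathfrak{T}} \omega(q).$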







A measure $\mu$ is said to be invariant under a transformation f if $\mu(f^{-1}(\mathfrak{T})) = \mu(\mathfrak{T})$.
(4.20) states that if a differential form is invariant, then the corresponding
measure is invariant.

While we have used admissible coordinate systems in defining and establish- 
ing the fundamental properties of exterior differential forms, in the practical 
applications we shall avoid them as they are usually complicated and difficult 
to handle. Indeed, the main reason for introducing exterior differential forms in 
this paper is to deal with measures on manifolds without the explicit use of ad- 
missible coordinates. The map of a differential form has been defined by use of 
coordinates, but the following lemma enables us to map them without
coordinates.

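LEMMA 4.3. Let f be an analytic homeomorphism of an analytic manifold $\mathfrak{X}$ upon
another, $\mathfrak{Y}$, which induces a map of an analytic function $g(p)$ on $\mathfrak{X}$ to a function $\tilde g(q)$
on $\mathfrak{Y}$. Then the differential form $dg(p)$ is mapped on $d\tilde g(q)$.

PROOF. Let $x_1, \ldots, x_n$ be coordinates for p and $y_1, \ldots, y_n$ coordinates for q.
From the definition of a map of a differential form

    $dx_i \xrightarrow{f} \sum_j \dfrac{\partial x_i}{\partial y_j}\, dy_j$

and thus

(4.21)    $dg = \sum_i \dfrac{\partial g}{\partial x_i}\, dx_i \xrightarrow{f} \sum_{i,j} \dfrac{\partial g}{\partial x_i} \dfrac{\partial x_i}{\partial y_j}\, dy_j = \sum_j \dfrac{\partial \tilde g}{\partial y_j}\, dy_j = d\tilde g.$    Q.E.D.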

The exterior differential forms representing the measures that we require on
the manifolds will be constructed in the following way. The differential of an 
analytic function (see Section 4.1) on an n-dimensional manifold is a linear 
differential form and so are linear combinations of differentials of functions, the 
coefficients being analytic functions on the manifold. The exterior product of n 
such linear differential forms is a differential form of maximum degree and thus 
represents a measure. 

In this way we shall construct invariant measures on the orthogonal group,
and the Grassmann, and the Stiefel manifolds. The invariance characterizes
them uniquely (up to a multiplicative constant) according to the

THEOREM 4.1. If $\mathfrak{X}$ is a topological space and $\mathfrak{H}$ is a transitive compact
topological group of transformations of $\mathfrak{X}$ onto itself such that HX is a continuous
function of H and X into $\mathfrak{X}$, then there exists a finite measure $\mu$ on $\mathfrak{X}$ invariant under
$\mathfrak{H}$. $\mu$ is unique in the sense that any other invariant measure on $\mathfrak{X}$ is a constant
finite multiple of $\mu$.

For our special applications we prove the existence of such invariant meas- 
ures by actually constructing them. For a proof of uniqueness see Weil [19]. 
Chevalley [4] gives an account of analytic manifolds and exterior differential 
forms on them in chapters 3 and 5, but from a more advanced and abstract 
standpoint. 


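4.4. Repeated integrals. The topological product $\mathfrak{M} \times \mathfrak{N}$ of two analytic
manifolds $\mathfrak{M}$ and $\mathfrak{N}$ is an analytic manifold. If $x_1, \ldots, x_m$ and $y_1, \ldots, y_n$ are
admissible coordinates of points $r \in \mathfrak{M}$ and $s \in \mathfrak{N}$ in domains $D_1 \subset \mathfrak{M}$ and $D_2 \subset \mathfrak{N}$,
then $x_1, \ldots, x_m, y_1, \ldots, y_n$ is an admissible coordinate system in
$D_1 \times D_2 \subset \mathfrak{M} \times \mathfrak{N}$. Given differential forms $\omega_1(r)$ on $\mathfrak{M}$ and $\omega_2(s)$ on $\mathfrak{N}$ and a function
$f(r, s)$ on $\mathfrak{M} \times \mathfrak{N}$, then the exterior product of $\omega_1$ and $\omega_2$ is a differential form
on $\mathfrak{M} \times \mathfrak{N}$ and we have

    $\int_{D_1 \times D_2} f(r, s)\, \omega_1(r)\, \omega_2(s) = \int_{D_2} \omega_2(s) \int_{D_1} f(r, s)\, \omega_1(r),$

for this reduces to the classical formula for repeated integrals when f and the
differential forms are expressed in terms of the coordinates in $D_1$ and $D_2$. A
similar result holds for the whole manifold or any subdomain $A \subset \mathfrak{M} \times \mathfrak{N}$,

    $\int_{A} f(r, s)\, \omega_1(r)\, \omega_2(s) = \int_{A_2} \omega_2(s) \int_{A_1(s)} f(r, s)\, \omega_1(r),$

because A can be approximated by a union of product sets $D_1 \times D_2$. $A_1(s) \subset \mathfrak{M}$
is the section of A at s and $A_2$ is the projection of A on $\mathfrak{N}$.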

The importance, for us, of this result is that a manifold, which may be Eu- 
clidean space, is often homeomorphic to a topological product of manifolds, 
apart from a set of measure zero perhaps. And a differential form when trans- 
formed to a differential form on the product manifold often splits into the ex- 
terior product of differential forms on the component manifolds. Such trans- 
formations are useful for evaluating integrals of exterior differential forms and 
for deriving sampling distributions. 

4.5. The differential form for the invariant measure on the orthogonal group. 
Let A be an orthogonal matrix; 


(4.22)    $A'A = I_n.$

To keep the notation clear, we introduce an abstract group manifold isomorphic
to the group of orthogonal matrices, and denote its elements by Greek letters.
We then regard the elements of our orthogonal matrices as functions on this
abstract group manifold. Indeed, they are analytic functions. Let $A(\alpha)$ be the
orthogonal matrix corresponding to the abstract group element $\alpha$. However, the
symbol H will be used to denote both a fixed orthogonal matrix and the
corresponding element of the abstract group; $A(H\alpha) = HA(\alpha)$.

The differential of a vector or a matrix (such as $A(\alpha)$) whose elements are
analytic functions on the group manifold, is defined as the vector or matrix of
differentials of the elements. Regarding A as a function of $\alpha$ and differentiating
(4.22) we have

    $(dA(\alpha))'A(\alpha) + (A(\alpha))'\, dA(\alpha) = 0.$

Thus $(A'\, dA)' = -A'\, dA$. Hence $A'\, dA$ is a skew symmetric matrix of linear
differential forms. The exterior product of the super diagonal elements gives us
a differential form





MULTIVARIATE ANALYSIS 


(4.23) w(a) = Tit; (a; da;) = Il%; (ayy day; + eee + Ani ddn;) 


where $a_i = a_i(\alpha)$ is the $i$th column vector of the matrix $A(\alpha)$. The differential
form is of degree $\tfrac{1}{2}n(n-1) = N$ and is thus of maximum degree. Hence it
defines a measure $\mu$ on $O(n)$ given by


(4.24)  $\mu(S) = \int_S \prod_{i<j} a_i'\,da_j, \qquad S \subset O(n).$


THEOREM 4.2. The differential forms $a_i'\,da_j$ and $\omega(\alpha)$ are invariant under the
transformation


(4.25)  $\alpha \to \bar\alpha = H\alpha$

or equivalently

$A(\alpha) \to A(\bar\alpha) = HA(\alpha).$


This transformation is called a left translation. 

PROOF. Applying the definition given in (4.12) to the individual elements of
the column vector $a_i(\alpha)$, we see that the map (4.25) induces a map of $a_i(\alpha)$
given by


(4.26)  $a_i(\alpha) \to \bar a_i(\bar\alpha)$


where the elements of the column vector $\bar a_i(\bar\alpha)$ are functions of $\bar\alpha$ defined by the
equation $\bar a_i(\bar\alpha) = a_i(H^{-1}\bar\alpha)$ and this equals $H'a_i(\bar\alpha)$. By Lemma 4.3 applied to
the elements of $da_j(\alpha)$, we have


(4.27)  $da_j(\alpha) \to d\bar a_j(\bar\alpha) = d(H'a_j(\bar\alpha)) = H'\,da_j(\bar\alpha).$

Hence

(4.28)  $a_i(\alpha)'\,da_j(\alpha) \to \bar a_i(\bar\alpha)'\,d\bar a_j(\bar\alpha) = a_i(\bar\alpha)'HH'\,da_j(\bar\alpha) = a_i(\bar\alpha)'\,da_j(\bar\alpha),$


and this is the value of the differential form $a_i'\,da_j$ at $\bar\alpha$, that is, the
differential form $a_i'\,da_j$ is invariant under (4.25).

Since the transform of the product of differential forms is the product of their
transforms, it follows that $\omega(\alpha)$ is invariant. Q.E.D.

THEOREM 4.3. $\omega(\alpha)$ is invariant under a right-translation


(4.29)  $\alpha \to \bar\alpha = \alpha H.$


PROOF. The transform of the matrix $A(\alpha)'\,dA(\alpha)$ of differential forms is
calculated as follows:


$A(\alpha) \to \bar A(\bar\alpha) = A(\bar\alpha H^{-1}) = A(\bar\alpha)H',$

$dA(\alpha) \to dA(\bar\alpha)H'$

and

$A(\alpha)'\,dA(\alpha) \to HA(\bar\alpha)'\,dA(\bar\alpha)H'.$


The exterior product of the super-diagonal elements of the matrix
$HA(\bar\alpha)'\,dA(\bar\alpha)H'$ will be the transform of $\omega(\alpha)$. To evaluate it, consider the
transformation


$A(\bar\alpha)'\,dA(\bar\alpha) \to HA(\bar\alpha)'\,dA(\bar\alpha)H'.$


This is a linear transformation of the $N = \tfrac{1}{2}n(n-1)$ linear differential forms
$a_i(\bar\alpha)'\,da_j(\bar\alpha)$. If a vector of linear differential forms undergoes a linear
transformation, then by Lemma 4.1, the exterior product of them is multiplied by the
determinant of the linear transformation. To complete the proof we require the

LEMMA 4.4. If $S$ is a skew symmetric matrix which we regard as a point in an
$N = \tfrac{1}{2}n(n-1)$ dimensional vector space and if $L$ is a fixed matrix, then the
transformation


(4.30)  $S \to LSL'$


is a linear transformation of the vector space whose determinant is a power of the
determinant of $L$.

PROOF OF LEMMA. The determinant of the linear transformation (4.30) is a
polynomial, say $p(L)$, in the elements of $L$. The transformation $L_1L_2$ is the same
as $L_2$ and $L_1$ carried out successively. Therefore, by the multiplication theorem
for determinants (applied to the $N$th order determinants)


(4.31)  $p(L_1L_2) = p(L_1)\,p(L_2).$


But a polynomial $p(L)$ in the elements of a matrix $L$ which satisfies the equation
(4.31) for all matrices $L_1$ and $L_2$ is a power of the determinant of $L$ (see
MacDuffee [12] chap. 3). Therefore $p(L)$ is a power of $|L|$. Q.E.D.

Applying the lemma to the proof of the theorem, we see that the exterior
product of the super-diagonal elements of $HA(\bar\alpha)'\,dA(\bar\alpha)H'$ is the product of the
super-diagonal elements of $A(\bar\alpha)'\,dA(\bar\alpha)$ multiplied by some power of $|H|$,
which is 1 apart from sign. Hence the transform of $\omega(\alpha)$ equals its value at $\bar\alpha$;
that is, $\omega(\alpha)$ is right-invariant. Q.E.D.

Since $\omega(A)$ is invariant under left and right translations, so is the measure $\mu$
which is given by (4.24).

LEMMA 4.5. $\mu$ is invariant under the transformation $A \to A^{-1} = A'$.

PROOF. Putting


$\mu(S) = \int_S \omega(A), \qquad S \subset O(n),$


introduce a new measure $\nu$ given by $\nu(S) = \mu(S^{-1})$ where by $S^{-1}$ we mean the
set of elements of the orthogonal group whose inverses are in $S$. Then under
the transformation $A \to HA$ we have $A' \to A'H'$. Thus when $S$ undergoes a
left translation, $S^{-1}$ undergoes a right translation. But $\mu$ is invariant under right
translations. Therefore $\nu$ is invariant under left translations. From the
uniqueness of invariant measures, $\nu$ must be equal to $\mu$ apart from a multiplicative
constant, which will be unity since $\nu(O(n)) = \mu(O(n)^{-1}) = \mu(O(n))$. The result
holds for any compact group.

Of course it is necessary to prove that the invariant differential forms which 
we construct are not identically zero. This will become evident when we obtain 
their integrals over the whole space. 

As an illustration, let us consider the invariant measure on the proper orthogo- 
nal group for the case n = 2. 


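$A = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}, \qquad 0 \le \theta < 2\pi,$


and


$\omega(A) = a_1'\,da_2 = \cos\theta\,d(\sin\theta) - \sin\theta\,d(\cos\theta) = d\theta.$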
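
The illustration for n = 2 can be checked numerically. The following Python sketch is not part of the original argument; it assumes NumPy, and the function names `rotation` and `a_prime_dA` are hypothetical. It approximates $A'\,dA/d\theta$ by a central difference and confirms that the matrix is skew symmetric with super-diagonal element 1, that is, $a_1'\,da_2 = d\theta$.

```python
import numpy as np

def rotation(theta):
    """Proper orthogonal 2 x 2 matrix A(theta) used in the illustration."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, s], [-s, c]])

def a_prime_dA(theta, h=1e-6):
    """Central-difference approximation to A' (dA/dtheta)."""
    dA = (rotation(theta + h) - rotation(theta - h)) / (2.0 * h)
    return rotation(theta).T @ dA

M = a_prime_dA(0.7)
print(M)  # approximately [[0, 1], [-1, 0]] for any theta
```

The choice theta = 0.7 is arbitrary; the same matrix is obtained at every angle, which is what invariance under translations requires here.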


4.6. The invariant measure on the Grassmann manifold. In the case of the or- 
thogonal group the differential form for the invariant measure was given by a 
single expression defined on the whole manifold. For the Grassmann and Stiefel 
manifolds this is not possible. Instead we construct a system of differential forms 
each defined locally. Their domains of definition together cover the whole mani- 
fold, and wherever they overlap, the differential forms are equal. The system of 
local differential forms is then regarded as a single global differential form. It 
represents the invariant measure. 

$G_{k,r}$ is the space of $k$-planes $\mathfrak{p}$ in $R^n$, as defined in Section 3.4. For points $\mathfrak{p}$
in a neighbourhood of a point $\mathfrak{p}_0 \in G_{k,r}$ let $a_1, \ldots, a_k$ and $b_1, \ldots, b_{n-k}$ be
orthonormal column vectors spanning the plane $\mathfrak{p}$ and its orthogonal complement
respectively, such that the elements of these vectors are all analytic functions
of $\mathfrak{p}$. Such a system of vectors can only be constructed locally. The invariant
measure is given by the differential form


(4.32)  $\prod_{j=1}^{n-k} \prod_{i=1}^{k} b_j'\,da_i$


in the domain where the vectors $a_i$ and $b_j$ are defined. The system of all such
expressions is a global differential form, denoted by $\mu_G(\mathfrak{p})$, which represents the
invariant measure on the Grassmann manifold.

There are three things to be proved: (1) that vectors such as $a_i$ and $b_j$ can be
constructed in the neighbourhood of any point $\mathfrak{p}_0 \in G_{k,r}$; (2) that the differential
form (4.32) does not depend on the choice of the $a_i$ and $b_j$, that is, that any two
expressions of type (4.32) are equal wherever their domains of definition
overlap; (3) that $\mu_G$ is invariant under the transformations of $G_{k,r}$ induced by
orthogonal transformations of $R^n$.

(1) Take $n$ fixed linearly independent column vectors in $R^n$ the first $k$ of
which span $\mathfrak{p}_0$. Take the orthogonal projections of the first $k$ vectors on $\mathfrak{p}$ and
of the remaining $n - k$ vectors on the orthogonal complement of $\mathfrak{p}$. For $\mathfrak{p} \in D_{\mathfrak{p}_0}$,
where $D_{\mathfrak{p}_0}$ is the domain* of $G_{k,r}$ for which the set of projections are linearly
independent, we can orthonormalize the projections on $\mathfrak{p}$ by the Gram-Schmidt
process to give orthonormal column vectors $a_1, \ldots, a_k$ in $\mathfrak{p}$ and orthonormalize
the projections on the orthogonal complement of $\mathfrak{p}$ to give orthonormal vectors
$b_1, \ldots, b_{n-k}$ in the orthogonal complement of $\mathfrak{p}$. Then $a_1, \ldots, a_k, b_1, \ldots, b_{n-k}$
are the required set of orthonormal vectors spanning $\mathfrak{p}$ and its orthogonal
complement respectively.
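
The construction in (1) can be sketched numerically. The Python code below is only an illustration under stated assumptions: it uses NumPy, the function name `local_frames` and its arguments are hypothetical, and a QR factorization with a sign correction stands in for the Gram-Schmidt process (the two agree when the diagonal of the triangular factor is made positive).

```python
import numpy as np

def local_frames(plane_basis, fixed):
    """Orthonormal bases of a k-plane p and of its orthogonal complement.

    plane_basis : n x k matrix whose columns span p (assumed full rank).
    fixed       : n x n matrix of fixed independent columns, the first k
                  of which span the reference plane p0.
    """
    n, k = plane_basis.shape
    Q, _ = np.linalg.qr(plane_basis)           # any orthonormal basis of p
    proj = Q @ Q.T                             # orthogonal projector onto p
    on_p = proj @ fixed[:, :k]                 # first k fixed vectors projected on p
    on_perp = (np.eye(n) - proj) @ fixed[:, k:]  # the rest projected on the complement
    A, Ra = np.linalg.qr(on_p)
    B, Rb = np.linalg.qr(on_perp)
    # Flip column signs so the result matches classical Gram-Schmidt.
    A = A * np.sign(np.diag(Ra))
    B = B * np.sign(np.diag(Rb))
    return A, B
```

The frames depend smoothly on the plane only while the projections stay linearly independent, which is the restriction to the domain described in the text.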

(2) Suppose that in a domain $D \subset G_{k,r}$ there are two expressions like (4.32)
constructed from vectors $a_i, b_j$ and $\bar a_i, \bar b_j$ respectively. Let $A, B, \bar A, \bar B$ be the
respective matrices with these vectors as their columns. Then there exist
orthogonal matrices $H_1$ and $H_2$ of respective orders $k$ and $n - k$ whose elements
are analytic functions of $\mathfrak{p} \in D$ such that


(4.33)  $\bar A = AH_1,$


(4.34)  $\bar B = BH_2.$


We can carry out the transformations (4.33) and (4.34) in two stages.

Differentiate (4.33), $d\bar A = (dA)H_1 + A\,dH_1$. Premultiply by $B'$, $B'\,d\bar A =
(B'\,dA)H_1 + 0$ since the columns of $A$ and $B$ are orthogonal. Hence by Lemma
4.1


$\prod_{i=1}^{k} b_j'\,d\bar a_i = |H_1| \prod_{i=1}^{k} b_j'\,da_i,$


and


$\prod_{j=1}^{n-k} \prod_{i=1}^{k} b_j'\,d\bar a_i = |H_1|^{n-k} \prod_{j=1}^{n-k} \prod_{i=1}^{k} b_j'\,da_i = \prod_{j=1}^{n-k} \prod_{i=1}^{k} b_j'\,da_i.$


In a similar way we can carry out the transformation (4.34) and we have

(4.35)  $\prod_{j=1}^{n-k} \prod_{i=1}^{k} \bar b_j'\,d\bar a_i = \prod_{j=1}^{n-k} \prod_{i=1}^{k} b_j'\,da_i.$


Thus the differential form (4.32) does not depend on the choice of $a_i$ and $b_j$.
Hence the local differential forms (4.32) agree wherever their domains of
definition overlap and they can be considered as local expressions for a single global
differential form $\mu_G$.


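(3) Proof of invariance of $\mu_G$. Let $H$ be a fixed orthogonal transformation:

(4.36)  $\mathfrak{p} \to \mathfrak{q} = H\mathfrak{p}; \quad \mathfrak{p} = H^{-1}\mathfrak{q}, \quad \mathfrak{p}, \mathfrak{q} \in G_{k,r}.$


Suppose $a_i(\mathfrak{p})$ and $b_j(\mathfrak{p})$ are the respective sets of orthonormal vectors which
span $\mathfrak{p}$ and its orthogonal complement, the elements of $a_i(\mathfrak{p})$ and $b_j(\mathfrak{p})$ being
functions of $\mathfrak{p}$. Then


(4.37)  $\begin{aligned}
a_i(\mathfrak{p}) &\to \bar a_i(\mathfrak{q}) = a_i(H^{-1}\mathfrak{q}), \qquad b_j(\mathfrak{p}) \to \bar b_j(\mathfrak{q}) = b_j(H^{-1}\mathfrak{q}),\\
da_i(\mathfrak{p}) &\to d\bar a_i(\mathfrak{q}) = da_i(H^{-1}\mathfrak{q}),\\
b_j(\mathfrak{p})'\,da_i(\mathfrak{p}) &\to b_j(H^{-1}\mathfrak{q})'\,da_i(H^{-1}\mathfrak{q})
 = b_j(H^{-1}\mathfrak{q})'H'H\,da_i(H^{-1}\mathfrak{q}) = (Hb_j(H^{-1}\mathfrak{q}))'\,d(Ha_i(H^{-1}\mathfrak{q})).
\end{aligned}$


* $D_{\mathfrak{p}_0}$ is almost all of $G_{k,r}$.


Hence


(4.38)  $\prod_{j=1}^{n-k} \prod_{i=1}^{k} b_j(\mathfrak{p})'\,da_i(\mathfrak{p}) \to \prod_{j=1}^{n-k} \prod_{i=1}^{k} (Hb_j(H^{-1}\mathfrak{q}))'\,d(Ha_i(H^{-1}\mathfrak{q})).$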


Since $a_i(H^{-1}\mathfrak{q}) = a_i(\mathfrak{p})$ and $b_j(H^{-1}\mathfrak{q}) = b_j(\mathfrak{p})$ span $\mathfrak{p}$ and its orthogonal
complement respectively, it follows that $Ha_i(H^{-1}\mathfrak{q})$ and $Hb_j(H^{-1}\mathfrak{q})$ span $\mathfrak{q} = H\mathfrak{p}$ and
its orthogonal complement respectively. Therefore the right-hand side of (4.38)
is equal to $\mu_G(\mathfrak{q})$. Thus the differential form $\mu_G(\mathfrak{p})$ is transformed by $H$ to a
differential form which at $\mathfrak{q}$ is equal to $\mu_G(\mathfrak{q})$. Hence $\mu_G$ is invariant. Q.E.D.

4.7. The invariant measure on the Stiefel manifold. $V_{k,n}$ is the space of
orthonormal $k$-frames in Euclidean $n$-space, of which we denote the typical member
by an $n \times k$ matrix, $A$, satisfying the equation $A'A = I_k$. As in the case of the
orthogonal group, $A'\,dA$ is a skew-symmetric matrix. Choose an $n \times (n - k)$
matrix, $B$, whose columns are orthonormal vectors spanning the orthogonal
complement of the plane spanned by the columns of $A$. As in the case of the
Grassmann manifold, the elements of $B$ must be analytic functions of admissible
coordinates for $A$. The invariant measure on the Stiefel manifold is given by
the differential form


(4.39)  $\prod_{j=1}^{n-k} \prod_{i=1}^{k} b_j'\,da_i \prod_{i<j} a_j'\,da_i,$


which is defined almost everywhere on $V_{k,n}$.

Expressions like (4.39) can be constructed in a set of domains which cover
the entire manifold. They define a differential form $\omega_k^V$ which is of maximum
degree, namely $\tfrac{1}{2}k(2n - k - 1)$, and therefore represents a measure. (4.39) does
not depend on the choice of $B$, the proof being similar to the one for $G_{k,r}$,
Section 4.6 (2). It is invariant under the transformation $A \to HA$ where $H$ is an
$n \times n$ orthogonal matrix. The proof is a combination of the proofs for $O(n)$
and $G_{k,r}$. The $b_j$ transform like the $b_j$ for the Grassmann manifold as in (4.37),
while $a_i$ and $da_i$ transform like the corresponding quantities for the orthogonal
group as in (4.26) and (4.27).

Finally (4.39) is invariant under the group of transformations $A \to AH$ where
$H$ is now a $k \times k$ orthogonal matrix. The proof is practically identical with that
of Theorem 4.3.


5. Integrals of the invariant measures. 


5.1. Integration of the invariant measure over the Stiefel manifold. We first
consider the case $k = 1$ which is that of the unit sphere in Euclidean $n$-space. The
column vector $a$ is of unit length and can be regarded as a point on the unit
sphere. $b_1, b_2, \ldots, b_{n-1}$ are orthonormal column vectors orthogonal to $a$. We
have to integrate the exterior differential form


(5.1)  $\prod_{i=1}^{n-1} b_i'\,da = \omega_1^V(a)$


over the unit sphere. As we shall see, the differential form is really the element 
of area on the unit sphere. This can be shown by a direct transformation to 
spherical polar coordinates, as follows. 

Let $x_n$ be the unit vector lying along the last coordinate axis and let $\mathfrak{q}$ be the
$(n - 1)$-plane perpendicular to it, which thus contains the first $n - 1$ coordinate
axes. Let $\theta_1$ be the angle between $a$ and $x_n$ and $\alpha$ be the unit vector lying along
the orthogonal projection of $a$ on $\mathfrak{q}$. $\theta_1$ and $\alpha$ are new "coordinates" for $a$, and


(5.2)  $a = x_n \cos\theta_1 + \alpha \sin\theta_1.$


If we exclude the points $x_n$ and $-x_n$ from the sphere, then $\theta_1$ has the range
$0 < \theta_1 < \pi$, and $\alpha$ ranges over the unit sphere in the Euclidean $(n - 1)$-space, $\mathfrak{q}$.
In fact, the unit sphere with $x_n$ and $-x_n$ removed, is the topological product of
the ranges of $\theta_1$ and $\alpha$.


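To express the differential form, (5.1), in terms of $\theta_1$ and $\alpha$, choose $b_1$ in the
2-plane spanned by $a$ and $x_n$ and such that $b_1$ is perpendicular to $a$; thus put


(5.3)  $b_1 = -x_n \sin\theta_1 + \alpha \cos\theta_1.$

Choose $b_2, \ldots, b_{n-1}$ in $\mathfrak{q}$, perpendicular to $\alpha$. Differentiating (5.2) we have

(5.4)  $da = (-x_n \sin\theta_1 + \alpha \cos\theta_1)\,d\theta_1 + d\alpha\,\sin\theta_1.$


Since $\alpha$ is a unit vector and is perpendicular to $x_n$, $\alpha'\alpha = 1$, $\alpha'x_n = 0$.
Differentiating,


(5.5)  $\alpha'\,d\alpha = 0, \qquad d\alpha'\,x_n = 0.$

Therefore, from (5.3), (5.4) and (5.5)

(5.6)  $b_1'\,da = d\theta_1.$

Since $b_2, \ldots, b_{n-1}$ are orthogonal to $x_n$ and $\alpha$,

$b_i'\,da = b_i'\,d\alpha\,\sin\theta_1, \qquad i = 2, \ldots, n-1.$

Hence, if we repeat the procedure,

(5.7)  $\prod_{i=1}^{n-1} b_i'\,da = \sin^{n-2}\theta_1\,d\theta_1 \prod_{i=2}^{n-1} b_i'\,d\alpha
 = \sin^{n-2}\theta_1 \sin^{n-3}\theta_2 \cdots \sin\theta_{n-2}\; d\theta_1\,d\theta_2 \cdots d\theta_{n-1},$


$0 < \theta_{n-1} < 2\pi, \qquad 0 < \theta_i < \pi, \qquad i = 1, \ldots, n - 2.$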


Hence the differential form (5.1) is simply the element of area on the unit
sphere. Integrating (5.7), we have


(5.8)  $\int_{V_{1,n}} \prod_{i=1}^{n-1} b_i'\,da = A(n)$


where $A(n)$ is the integral of (5.7) which is the area of the unit sphere in $R^n$:


(5.9)  $A(n) = \dfrac{2\pi^{n/2}}{\Gamma(\tfrac{1}{2}n)}.$


In the general case, the invariant measure $\omega_k^V$ on $V_{k,n}$ is represented almost
everywhere by the differential form (4.39).
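
Formula (5.9) can be compared with a direct integration of (5.7). The following Python sketch is illustrative only; the function names are hypothetical, only the standard library is used, and the midpoint rule is an arbitrary choice of quadrature.

```python
import math

def sphere_area(n):
    """Area of the unit sphere in R^n, formula (5.9)."""
    return 2.0 * math.pi ** (n / 2.0) / math.gamma(n / 2.0)

def sphere_area_by_integration(n, steps=20000):
    """Integrate (5.7) factor by factor with the midpoint rule (n >= 2).

    theta_{n-1} contributes 2*pi; theta_i (i = 1, ..., n-2) contributes
    the integral of sin^(n-1-i) over (0, pi).
    """
    total = 2.0 * math.pi
    h = math.pi / steps
    for i in range(1, n - 1):
        power = n - 1 - i
        total *= h * sum(math.sin((j + 0.5) * h) ** power for j in range(steps))
    return total

print(sphere_area(3), sphere_area_by_integration(3))  # both close to 4*pi
```

For n = 2 the loop is empty and the result is the circumference $2\pi$ of the unit circle.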







THEOREM 5.1.


(5.10)  $\int_{V_{k,n}} \omega_k^V = \prod_{i=1}^{k} A(n - i + 1)$

where $A(\nu)$ is the area of the unit sphere in $R^\nu$ given in (5.9).

PROOF. It is sufficient to prove that


(5.11)  $\int_{V_{k,n}} \omega_k^V = A(n) \int_{V_{k-1,n-1}} \omega_{k-1}^V,$

where $\omega_{k-1}^V$ is the invariant measure on the Stiefel manifold $V_{k-1,n-1}$ of
$(k - 1)$-frames in $R^{n-1}$, because iteration of (5.11) gives (5.10).
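Rewriting (4.39) in full, we have


(5.12)  $\begin{aligned}
\omega_k^V = {} & a_2'\,da_1\; a_3'\,da_1 \cdots a_k'\,da_1\; b_1'\,da_1 \cdots b_{n-k}'\,da_1 \\
 & \cdot\; a_3'\,da_2 \cdots a_k'\,da_2\; b_1'\,da_2 \cdots b_{n-k}'\,da_2 \\
 & \cdots \\
 & \cdot\; b_1'\,da_k \cdots b_{n-k}'\,da_k.
\end{aligned}$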


The differential form in the first row depends only upon $a_1$. In fact, by an
argument similar to the proof of (4.35) one can show that the first row remains
unaltered if the vectors $a_2, \ldots, a_k, b_1, \ldots, b_{n-k}$ in it are replaced by any set
of orthonormal vectors orthogonal to $a_1$. Comparison with formula (5.1) shows
that the first row of (5.12) is the element of area on the unit sphere $V_{1,n}$ given
by the equation $a_1'a_1 = 1$ in $R^n$. Denote it by $\omega_1^V(a_1)$.

For a fixed $a_1$, $(a_2, \ldots, a_k)$ range over all $(k - 1)$-frames in the $(n - 1)$-plane
perpendicular to $a_1$. Denote this set of $(k - 1)$-frames by $V_{k-1,n-1}(a_1)$. The
integral (5.10) can then be written as a repeated integral


(5.13)  $\int_{V_{k,n}} \omega_k^V = \int_{V_{1,n}} \omega_1^V(a_1) \int_{V_{k-1,n-1}(a_1)} \bar\omega$


where $\bar\omega$ is the differential form consisting of the last $k - 1$ rows of (5.12).
Although $\bar\omega$ and the range $V_{k-1,n-1}(a_1)$ over which it has to be integrated both
depend on $a_1$, nevertheless the integral


(5.14)  $g(a_1) = \int_{V_{k-1,n-1}(a_1)} \bar\omega$


does not depend on $a_1$, as we shall now prove.
Let $H$ be any fixed $n \times n$ orthogonal matrix. Then


$g(a_1) = \int_{V_{k-1,n-1}(a_1)} a_3'\,da_2 \cdots b_1'\,da_2 \cdots
 = \int_{V_{k-1,n-1}(a_1)} a_3'H'H\,da_2 \cdots b_1'H'H\,da_2 \cdots .$


Put

$\bar a_i = Ha_i \quad (i = 1, \ldots, k), \qquad \bar b_j = Hb_j \quad (j = 1, \ldots, n - k).$

As $(a_2, \ldots, a_k)$ ranges over $V_{k-1,n-1}(a_1)$, $(\bar a_2, \ldots, \bar a_k)$ ranges over
$V_{k-1,n-1}(Ha_1) = V_{k-1,n-1}(\bar a_1)$.


Hence

(5.15)  $g(a_1) = \int_{V_{k-1,n-1}(\bar a_1)} \bar a_3'\,d\bar a_2 \cdots \bar b_1'\,d\bar a_2 \cdots = g(\bar a_1).$


Any unit vector $\bar a_1$ can be obtained from $a_1$ by a suitable choice of $H$.
Therefore $g(a_1)$ does not depend on $a_1$ and is simply a constant $c$.

Choose $H$ so that $\bar a_1$ is a vector with its last coordinate unity and all others
zero. Since $\bar a_2, \ldots, \bar b_1, \ldots$ are orthogonal to $\bar a_1$, each will then have zero as its
last coordinate. Hence $c$ can be seen to be the integral of the invariant measure
on the space of $(k - 1)$-frames in $R^{n-1}$. Since $c$ is a constant, we can integrate
over the unit sphere in the right-hand side of (5.13) giving (5.11), and the
theorem follows.

5.2. Integration of the invariant measure on the orthogonal group. It is a special
case of the integral for the Stiefel manifold. If $A$ is an $n \times n$ orthogonal matrix
then from Theorem 5.1 we have


(5.16)  $\int_{O(n)} \omega_n^V = \int_{O(n)} \prod_{i<j} a_j'\,da_i = \prod_{i=1}^{n} A(i) = \prod_{i=1}^{n} \dfrac{2\pi^{i/2}}{\Gamma(\tfrac{1}{2}i)}.$


(5.16) gives the integral over the whole (improper) orthogonal group, that is,
including the orthogonal matrices with negative determinant. The formula,
(5.16), is consistent if we take the area of the unit sphere in $R^1$, which consists
of only two points, namely $\pm 1$, to be 2.

5.3. Integration of the invariant measure on the Grassmann manifold. The dif- 
ferential form, (4.39), representing the invariant measure on the Stiefel mani- 
fold looks like a product of differential forms representing the invariant meas- 
ures on the Grassmann manifold and the orthogonal group. It suggests that 
the integral of the invariant measure on the Stiefel manifold should be the 
product of the integrals of the invariant measures on the Grassmann manifold 
and the orthogonal group, and such is, indeed, the case. Having evaluated the 
integrals of the invariant measures on the Stiefel manifold and the orthogonal 
group, we can thus find the integral of the invariant measure on the Grassmann 
manifold. 

If $A \in V_{k,n}$ is an $n \times k$ matrix with orthonormal column vectors, the column
vectors of $A$ span a $k$-plane in $R^n$ which can be regarded as a point, $\mathfrak{p}$, in the
Grassmann manifold $G_{k,r}$ ($r = n - k$). The $k$-frame is determined uniquely by
the specification of the plane, $\mathfrak{p}$, and the orientation of the $k$-frame in $\mathfrak{p}$. To
specify the orientation, introduce another "reference" $k$-frame, represented by
the columns of an $n \times k$ matrix, $H$, in the plane $\mathfrak{p}$, the elements of $H$ being
analytic functions of $\mathfrak{p}$ for almost all $\mathfrak{p}$. Then


(5.17)  $A = HC$


where $C$ is a $k \times k$ orthogonal matrix. $\mathfrak{p}$ and $C$ are functions of $A$ and the
transformation $A \leftrightarrow \mathfrak{p}, C$ is one to one where $A$ ranges over almost all the Stiefel
manifold, $\mathfrak{p}$ over almost all the Grassmann manifold and $C$ over the orthogonal
group of order $k$.

Differentiating (5.17)


$dA = H\,dC + dH\,C.$


It can be assumed that the $n \times (n - k)$ matrix, $B$, introduced in Section 4.7
to construct the invariant measure on the Stiefel manifold, is a function of $\mathfrak{p}$
alone. Since $B'H = 0$,


(5.18)  $B'\,dA = B'\,dH\,C$

(5.19)  $A'\,dA = C'\,dC + C'H'\,dH\,C.$


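Therefore


(5.20)  $\prod_{i=1}^{k} \prod_{j=1}^{n-k} b_j'\,da_i = |C|^{n-k} \prod_{i=1}^{k} \prod_{j=1}^{n-k} b_j'\,dh_i = \prod_{i=1}^{k} \prod_{j=1}^{n-k} b_j'\,dh_i,$


(5.21)  $\prod_{i<j} a_j'\,da_i = \prod_{i<j} c_j'\,dc_i + \ast\,dH$


where $\ast\,dH$ signifies differential forms involving the elements of $dH$. The
right-hand side of (5.20) is a differential form defined on the Grassmann manifold and
is of maximum degree, while $H$ is a function defined on the same space.
Therefore, the product of any differential of $dH$ with (5.20) is zero.

Hence


(5.22)  $\prod_{i=1}^{k} \prod_{j=1}^{n-k} b_j'\,da_i \prod_{i<j} a_j'\,da_i = \prod_{i=1}^{k} \prod_{j=1}^{n-k} b_j'\,dh_i \prod_{i<j} c_j'\,dc_i,$


and


(5.23)  $K = \int_{G_{k,r}} \mu_G = \int_{G_{k,r}} \prod_{i=1}^{k} \prod_{j=1}^{n-k} b_j'\,dh_i = \int_{V_{k,n}} \omega_k^V \Big/ \int_{O(k)} \omega_k^V = \prod_{i=n-k+1}^{n} A(i) \Big/ \prod_{i=1}^{k} A(i),$


where $A(\nu)$ is given by (5.9).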
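
The total measures in (5.10), (5.16) and (5.23) are simple products of sphere areas and can be tabulated directly. The sketch below is an illustration, not part of the original text; the function names are hypothetical and only the Python standard library is assumed.

```python
import math

def sphere_area(n):
    """A(n) of (5.9); A(1) = 2 corresponds to the two points +1 and -1."""
    return 2.0 * math.pi ** (n / 2.0) / math.gamma(n / 2.0)

def stiefel_volume(k, n):
    """Total invariant measure of V_{k,n}, formula (5.10)."""
    return math.prod(sphere_area(n - i + 1) for i in range(1, k + 1))

def orthogonal_group_volume(n):
    """Total invariant measure of the whole orthogonal group, formula (5.16)."""
    return stiefel_volume(n, n)

def grassmann_volume(k, n):
    """The constant K of (5.23): Stiefel total divided by that of O(k)."""
    return stiefel_volume(k, n) / orthogonal_group_volume(k)

print(orthogonal_group_volume(2))  # 4*pi: two circles of length 2*pi
print(grassmann_volume(1, 3))      # 2*pi: half the area of the unit sphere in R^3
```

For k = 1 the Grassmann manifold is the space of lines through the origin, and K is half the sphere area because antipodal points span the same line.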


6. Measures invariant under an induced group of transformations. 

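6.1. Definitions. If $f$ is a map of a space $\mathfrak{X}$ on a space $\mathfrak{Y}$ then $f^{-1}y$ (or $f^{-1}T$)
for $y \in \mathfrak{Y}$ (or $T \subset \mathfrak{Y}$) denotes the inverse image of $y$ (or $T$) that is, the set of
all points of $\mathfrak{X}$ mapped by $f$ into $y$ (or $T$). We say that a measure $\mu$ on $\mathfrak{X}$ is mapped
by $f$ on a measure $\tilde\mu$ on $\mathfrak{Y}$ if, for every (measurable) set $T \subset \mathfrak{Y}$, $\tilde\mu(T) = \mu(f^{-1}T)$.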

A many-to-one map $f$ of a space $\mathfrak{X}$ on a space $\mathfrak{G}$ divides $\mathfrak{X}$ into a system of
equivalence classes, each equivalence class being the inverse image $f^{-1}p$ of a
point $p \in \mathfrak{G}$. The set of equivalence classes is thus in one-to-one correspondence
with the points of $\mathfrak{G}$. Now if $\mathfrak{H}$ is a group of transformations of $\mathfrak{X}$ on itself each
of whose elements transforms each equivalence class onto an equivalence class,
then $\mathfrak{H}$ may be said to induce a group of transformations of the space of
equivalence classes. Since each equivalence class corresponds to a point in $\mathfrak{G}$, $\mathfrak{H}$ thus
induces a group of transformations of $\mathfrak{G}$.

DEFINITION. Let $\mathfrak{H}$ be a group of one-to-one transformations of a space $\mathfrak{X}$ onto
itself and $f$ be a map of $\mathfrak{X}$ on another space $\mathfrak{G}$. If for each $p \in \mathfrak{G}$ and each $H \in \mathfrak{H}$
there exists a point $p_1 \in \mathfrak{G}$ such that


(6.1)  $H(f^{-1}p) = f^{-1}p_1$


then we define the transformation $H_1$ of $\mathfrak{G}$ by the equation $p_1 = H_1p$ and say
that the transformation $H$ acting on $\mathfrak{X}$ induces the transformation $H_1$ on $\mathfrak{G}$
and that the group $\mathfrak{H}$ induces a group $\mathfrak{H}_1$.

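LEMMA 6.1. Suppose that $f$ maps a space $\mathfrak{X}$ on a space $\mathfrak{G}$ and that a group $\mathfrak{H}$
of transformations of $\mathfrak{X}$ onto itself yields an induced group $\mathfrak{H}_1$ of transformations
of $\mathfrak{G}$ onto itself. Let $\mu$ be a measure on $\mathfrak{X}$ mapped by $f$ to a measure $\tilde\mu$ on $\mathfrak{G}$. Then
if $\mu$ is invariant under $\mathfrak{H}$, $\tilde\mu$ is invariant under $\mathfrak{H}_1$.

PROOF. For a measurable subset $T \subset \mathfrak{G}$


$\tilde\mu\{H_1T\} = \mu\{f^{-1}H_1T\} = \mu\{Hf^{-1}T\} = \mu\{f^{-1}T\} = \tilde\mu\{T\}.$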


As a very simple illustration of the lemma, let $\mathfrak{X}$ be the Euclidean 2-plane and
$\mathfrak{H}$ the group of rotations of it. Let $\mu$ be a finite measure on $\mathfrak{X}$ invariant under
$\mathfrak{H}$, that is, circularly symmetrical, for example $e^{-\frac{1}{2}(x_1^2 + x_2^2)}\,dx_1\,dx_2$. Introduce polar
coordinates $(r, \theta)$ in the plane. Let $\mathfrak{G}$ be the unit circle with $\theta$ as its coordinate
and let $f$ map a point $(r, \theta)$ in the plane, on $\theta$. $f$ maps the measure $\mu$ on a measure
$\tilde\mu$ in $\mathfrak{G}$. Then the group $\mathfrak{H}$ induces a group of rotations of the unit circle, under
which, by Lemma 6.1, the measure $\tilde\mu$ must be invariant. Thus $\tilde\mu$ is the uniform
measure on the circle. We give several applications of the lemma.
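
The illustration can be imitated by simulation. The following Python sketch assumes NumPy; the function name, the sample size and the number of bins are arbitrary choices made for this example. It draws points from the circularly symmetrical normal distribution in the plane, maps each to its polar angle, and tabulates the relative frequencies of the angles.

```python
import numpy as np

def angle_frequencies(num=200_000, bins=8, seed=0):
    """Relative frequencies of the polar angle of standard normal points."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((num, 2))
    # The map f: (r, theta) -> theta, with theta reduced to [0, 2*pi).
    theta = np.arctan2(x[:, 1], x[:, 0]) % (2.0 * np.pi)
    counts, _ = np.histogram(theta, bins=bins, range=(0.0, 2.0 * np.pi))
    return counts / num

print(angle_frequencies())  # each entry should be near 1/8
```

Equal frequencies in equal arcs are what the uniform measure on the circle predicts; the simulation illustrates the lemma but does not prove it.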

6.2. Distribution of the plane spanned by a set of random vectors. Let $X$ be
an $n \times k$ matrix whose rows are $n$ independent observations from a normal
$k$-variate distribution with means zero, that is, with the distribution (2.1). As
pointed out in Section 2 the distribution is invariant under the orthogonal
group of transformations (2.2). Consider the columns of $X$ as $k$ vectors in
Euclidean $n$-space namely, $x_1, \ldots, x_k$ and let $\mathfrak{p} = f(X)$ be the plane spanned
by them. As, with probability one, $x_1, \ldots, x_k$ will be linearly independent,
the plane will be $k$ dimensional. Thus $f$ is a map from the space of $n \times k$
matrices, $\mathfrak{X}$, to the Grassmann manifold $G_{k,r}$ ($r = n - k$). The orthogonal group
of transformations of $\mathfrak{X}$ induces a group of transformations of $G_{k,r}$. Hence, by
Lemma 6.1, the distribution of $\mathfrak{p}$ is invariant under the induced group of
transformations. According to Sections 4.3 and 4.6 the invariance characterizes the
distribution of $\mathfrak{p}$ uniquely as the invariant measure on the Grassmann manifold,
and the probability density is given by the differential form


(6.2)  $K^{-1} \prod_{j=1}^{n-k} \prod_{i=1}^{k} b_j'\,da_i.$







The invariance of the distribution of $\mathfrak{p}$ was recognized by Hotelling [8]. The
type of argument, given above, to prove the invariance has been used by T. W.
Anderson in other connections.
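
One consequence of the invariance is easy to examine by simulation. If $P$ denotes the orthogonal projector onto the random plane, invariance implies that its mean commutes with every orthogonal matrix and has trace $k$, so the mean is $(k/n)I_n$. The Python sketch below assumes NumPy, uses hypothetical function names, and checks only this necessary consequence, not the full invariance.

```python
import numpy as np

def projector_onto_column_space(X):
    """Orthogonal projector onto the plane spanned by the columns of X."""
    Q, _ = np.linalg.qr(X)
    return Q @ Q.T

def mean_projector(n, k, draws=20000, seed=0):
    """Monte Carlo mean of the projector for an n x k standard normal X."""
    rng = np.random.default_rng(seed)
    total = np.zeros((n, n))
    for _ in range(draws):
        total += projector_onto_column_space(rng.standard_normal((n, k)))
    return total / draws

print(np.round(mean_projector(4, 2), 2))  # close to 0.5 times the identity
```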

6.3. Relation to the invariant measure on the orthogonal group. As a second
application of the lemma we show how the invariant measures on the Grassmann
and Stiefel manifolds may be derived from the invariant measure on the
orthogonal group. It was shown in Sections 3.3 and 3.4 that the Grassmann
and Stiefel manifolds may be regarded as coset spaces of the orthogonal group.
Let $\mathfrak{p}_0$ be a fixed $k$-plane in $R^n$, (thus $\mathfrak{p}_0 \in G_{k,r}$) and $A \in \mathfrak{A}$ an invariantly
distributed orthogonal matrix. The matrix $A$ transforms $R^n$ into itself and induces
a transformation of $G_{k,r}$ into itself. The mapping


(6.3)  $A \to A\mathfrak{p}_0 = \mathfrak{p}$


from $\mathfrak{A}$ to $G_{k,r}$ maps the invariant measure on $\mathfrak{A}$ to a measure on $G_{k,r}$ which,
by Lemma 6.1 must be invariant.

The representation of a homogeneous space in terms of the group is very 
useful because the group has more symmetry, namely, a group element can be 
transformed from both left and right by other group elements and also the 
inverse can be taken. The representation of the invariant measure on the Grass- 
mann manifold in terms of the invariant measure on the orthogonal group will 
be used in deriving the distribution of the canonical correlation coefficients. 

The invariant distribution on the Grassmann manifold was obtained above
by a random transformation of a fixed plane $\mathfrak{p}_0$ by an invariantly distributed
orthogonal matrix. The result still holds if $\mathfrak{p}_0$ has an arbitrary probability
distribution provided it is independent of $A$.

THEOREM 6.1. If $\mathfrak{p}_0$ is a random point in the Grassmann manifold with an
arbitrary probability distribution and $A$ is an independently invariantly distributed
orthogonal matrix and if


(6.4)  $\mathfrak{p} = A\mathfrak{p}_0$


then $\mathfrak{p}$ is invariantly distributed in the Grassmann manifold.

PROOF. Suppose $\mathfrak{p}_0 \in \mathfrak{G}$, $\mathfrak{p} \in G_{k,r}$ and $A \in \mathfrak{A}$. (6.4) is a map of $\mathfrak{A} \times \mathfrak{G}$ onto
$G_{k,r}$. The joint distribution of the pair $(A, \mathfrak{p}_0)$ in $\mathfrak{A} \times \mathfrak{G}$ is invariant under the
group of transformations


(6.5)  $(A, \mathfrak{p}_0) \to (HA, \mathfrak{p}_0)$


where $H$ is an orthogonal matrix or transformation. But the transformation
(6.5) induces the transformation $\mathfrak{p} \to H\mathfrak{p}$ in $G_{k,r}$. Hence by Lemma 6.1, $\mathfrak{p}$ is
invariantly distributed. Q.E.D.

In a similar way one can show that the invariant measure on the Stiefel 
manifold can be obtained by a random orthogonal transformation of a fixed 
k-frame, or even of a random k-frame provided it is distributed independently 
of the orthogonal matrix. 
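
A random orthogonal transformation of a fixed frame is easy to simulate. The Python sketch below assumes NumPy and uses a technique that is not taken from the text: an invariantly distributed orthogonal matrix is drawn from the QR factorization of a standard normal matrix, with the column signs fixed by the diagonal of the triangular factor. The function names are hypothetical.

```python
import numpy as np

def random_orthogonal(n, rng):
    """Orthogonal matrix drawn from the invariant distribution on O(n).

    Assumption: QR of a standard normal matrix, with column signs chosen
    so that the triangular factor has a positive diagonal.
    """
    Z = rng.standard_normal((n, n))
    Q, R = np.linalg.qr(Z)
    return Q * np.sign(np.diag(R))

def random_frame(fixed_frame, rng):
    """Random orthogonal transformation of a fixed n x k orthonormal frame."""
    n = fixed_frame.shape[0]
    return random_orthogonal(n, rng) @ fixed_frame

rng = np.random.default_rng(0)
frame = random_frame(np.eye(5)[:, :2], rng)  # a random 2-frame in R^5
```

The columns of `frame` span a random plane of the kind described in (6.3); the frame itself is a random point of the Stiefel manifold.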

6.4. Critical angles between two planes. 







THEOREM 6.2. If $\mathfrak{p}$ and $\mathfrak{q}$ are planes of dimension $p$ and $q$ respectively, $p \le q$,
in Euclidean space $R^n$ and if $\theta$ is the angle between an arbitrary vector $a$ in $\mathfrak{p}$ and
an arbitrary vector $\alpha$ in $\mathfrak{q}$, then as $a$ and $\alpha$ vary over $\mathfrak{p}$ and $\mathfrak{q}$ respectively, $\theta$ has $p$
stationary values, $\tfrac{1}{2}\pi \ge \theta_1 \ge \theta_2 \ge \cdots \ge \theta_p \ge 0$ corresponding to pairs of vectors
say $a_1, \alpha_1, \ldots, a_p, \alpha_p$. The stationary angles $\theta_i$ are uniquely determined by
$\mathfrak{p}$ and $\mathfrak{q}$ and if no two of them are equal, the corresponding vectors are uniquely
determined apart from length and a simultaneous reversal of direction of $a_i$ and
$\alpha_i$. $a_i$ is orthogonal to $a_j$ and $\alpha_j$ ($i \ne j$).

The angle between $a_i$ and $\alpha_i$ is, of course, $\theta_i$. These angles are called the
critical angles between the planes $\mathfrak{p}$ and $\mathfrak{q}$. For a proof of the theorem see
Hotelling [8] or Roy [17].
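
The critical angles can be computed numerically. The sketch below is an illustration only and uses a method that is not described in the text: with orthonormal bases of the two planes as the columns of matrices, the cosines of the critical angles are taken as the singular values of the product of one basis transposed with the other. NumPy is assumed and the function name is hypothetical.

```python
import numpy as np

def critical_angles(P, Q):
    """Critical angles between the column spaces of P (n x p) and Q (n x q).

    Assumes p <= q and full column rank. Returns the angles in
    decreasing order, each between 0 and pi/2.
    """
    Up, _ = np.linalg.qr(P)
    Uq, _ = np.linalg.qr(Q)
    cosines = np.linalg.svd(Up.T @ Uq, compute_uv=False)  # decreasing order
    return np.arccos(np.clip(cosines, 0.0, 1.0))[::-1]

t = 0.3
P = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
Q = np.array([[1.0, 0.0], [0.0, np.cos(t)], [0.0, np.sin(t)]])
print(critical_angles(P, Q))  # approximately [0.3, 0.0]
```

In the example the two planes share the first coordinate axis, which gives the critical angle zero, and are inclined at the angle t in the perpendicular direction.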


7. Application to the distribution of the canonical correlation coefficients
and the roots of certain determinantal equations. If the rows of the
matrix $[X \,\vdots\, Y] = [x_1 \cdots x_k \; y_1 \cdots y_q]$ are $n$ independent samples from a
$(k + q)$-variate distribution, with means all zero, and if $\mathfrak{p}$ and $\mathfrak{q}$ are the planes spanned
by the column vectors $x_1, \ldots, x_k$ and $y_1, \ldots, y_q$ respectively, then Hotelling
[8] (see also Roy [17]) showed that the sample canonical correlations between
$X$ and $Y$ are the cosines of the critical angles $\theta_1, \ldots, \theta_l$ between $\mathfrak{p}$ and $\mathfrak{q}$ where
$l = \min(k, q)$. Denote $(\theta_1, \ldots, \theta_l)$ by $Z(\mathfrak{p}, \mathfrak{q})$.

The canonical correlation coefficients are often expressed as the roots of a
determinantal equation. Let $X_1$ be the $n \times k$ matrix whose $i$th column is the
orthogonal projection of the $i$th column of $X$ on the plane spanned by the column
vectors of $Y$. Then the roots of the determinantal equation $|X_1'X_1 - \lambda X'X| = 0$
are the squares of the canonical correlation coefficients, $\cos\theta_i$. The same problem
also arises from multivariate analysis of variance (at least, in the null case).
For this problem $Y$ is fixed instead of random, for example, $Y$ can be taken as
the matrix whose column vectors represent the first $q$ coordinate axes.
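
The agreement between the determinantal equation and the critical angles can be checked on simulated data. The Python sketch below assumes NumPy; the function names are hypothetical, and the roots are obtained as the eigenvalues of $(X'X)^{-1}$ times the cross-product matrix of the projected columns, which is one of several equivalent ways to solve the equation.

```python
import numpy as np

def roots_of_determinantal_equation(X, Y):
    """Roots of |Xp'Xp - lambda X'X| = 0, Xp being X projected on span(Y)."""
    Qy, _ = np.linalg.qr(Y)
    X_proj = Qy @ (Qy.T @ X)                   # columns of X projected on span(Y)
    M = np.linalg.solve(X.T @ X, X_proj.T @ X_proj)
    return np.sort(np.linalg.eigvals(M).real)[::-1]

def squared_cosines_of_critical_angles(X, Y):
    """Squared cosines of the critical angles between span(X) and span(Y)."""
    Qx, _ = np.linalg.qr(X)
    Qy, _ = np.linalg.qr(Y)
    return np.linalg.svd(Qx.T @ Qy, compute_uv=False) ** 2

rng = np.random.default_rng(0)
X = rng.standard_normal((10, 2))
Y = rng.standard_normal((10, 3))
print(roots_of_determinantal_equation(X, Y))
print(squared_cosines_of_critical_angles(X, Y))  # same values, same order
```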

The distribution of the canonical correlations for samples from normal popu- 
lations in the null case, was found simultaneously by Fisher [5], Hsu [9], Roy 
[16] and Mood [13], in 1939. Let us illustrate the application of Grassmann and 
Stiefel manifolds by giving yet another derivation! 

For the null case, that is, when $X$ and $Y$ are independent, Fisher has pointed
out that the assumption that $Y$ is normally distributed can be dropped. Thus
assume that the rows of $X$ are independent samples from a $k$-variate normal
distribution (with means zero) and that $Y$ has any arbitrary distribution
independent of that of $X$. In Section 6.2 it was proved that the plane $\mathfrak{p}$ is invariantly
distributed. Since $Y$ is independent of $X$, $\mathfrak{q}$ is independent of $\mathfrak{p}$. From the joint
distribution of $\mathfrak{p}$ and $\mathfrak{q}$ we derive the distribution of the critical angles, $Z(\mathfrak{p}, \mathfrak{q})$.

The distribution of $Z(\mathfrak{p}, \mathfrak{q})$ remains the same if the random plane $\mathfrak{q}$ is replaced
by a fixed plane. We prove this by representing the distribution of $\mathfrak{p}$ in terms
of the distribution of an invariantly distributed orthogonal matrix as shown
in Section 6.3. Let $A \in \mathfrak{A}$ be a random orthogonal matrix distributed invariantly
and independently of $\mathfrak{q}$. Let $\mathfrak{p}_0$ be an arbitrary fixed $k$-plane. Then $A\mathfrak{p}_0$ is a
random $k$-plane distributed invariantly and independently of $\mathfrak{q}$ and thus the
joint distribution of $A\mathfrak{p}_0$ and $\mathfrak{q}$ is the same as the joint distribution of $\mathfrak{p}$ and $\mathfrak{q}$.
Hence $Z(A\mathfrak{p}_0, \mathfrak{q})$ has the same distribution as $Z(\mathfrak{p}, \mathfrak{q})$.

Since the critical angles between two planes are invariant under their
simultaneous orthogonal transformation, we have, on multiplying $A\mathfrak{p}_0$ and $\mathfrak{q}$ by $A^{-1}$


(7.1)  $Z(A\mathfrak{p}_0, \mathfrak{q}) = Z(\mathfrak{p}_0, A^{-1}\mathfrak{q}).$


But by Lemma 4.5, $A^{-1}$ has the same distribution as $A$; therefore $Z(A\mathfrak{p}_0, \mathfrak{q})$
has the same distribution as $Z(\mathfrak{p}_0, A\mathfrak{q})$. Since, by Theorem 6.1, $A\mathfrak{q}$ is
invariantly distributed, it follows that $A\mathfrak{q}$ has the same distribution as $A\mathfrak{q}_0$, $\mathfrak{q}_0$ being
an arbitrary but fixed $q$-plane.

We have now proved that the distribution of $Z(\mathfrak{p}, \mathfrak{q})$ is the same as the
distribution of $Z(\mathfrak{p}_0, A\mathfrak{q}_0)$ which, again, is the same as that of $Z(A\mathfrak{p}_0, \mathfrak{q}_0)$. We
can use whichever is the more convenient.

CASE 1. $n \ge k + q$. Let us choose the plane of the smaller number of
dimensions as the random plane. Suppose it is $A\mathfrak{p}_0$ (hence $k \le q$) which we shall now
denote by $\mathfrak{p}$. Were $k > q$, we could simply use $Z(\mathfrak{p}_0, A\mathfrak{q}_0)$ instead of $Z(A\mathfrak{p}_0, \mathfrak{q}_0)$.
Choose $\mathfrak{q}_0$ as the plane spanned by the first $q$ coordinate axes.

If we take arbitrary orthonormal vectors $a_1, \ldots, a_k$ in $\mathfrak{p}$ and $n - k$ arbitrary
orthonormal vectors $b_1, \ldots, b_{n-k}$ in the orthogonal complement of $\mathfrak{p}$, then
according to Sections 4.6 and 6.2 the distribution of $\mathfrak{p} = A\mathfrak{p}_0$ is given by the
differential form (6.2).

We shall see that apart from a set of measure zero, the Grassmann manifold
is analytically homeomorphic to the topological product of a simplex in $R^k$,
over which the critical angles range, and two Stiefel manifolds. By transforming
the differential form to a differential form on the product space and integrating
over the two Stiefel manifolds, we find the distribution of the critical angles.

According to Theorem 6.2 the vectors in $\mathfrak{p}$ which make the critical angles with
$\mathfrak{q}_0$ are uniquely determined by $\mathfrak{p}$ apart from length and reversal of direction,
provided no two critical angles are equal and no critical angle is 0 or $\tfrac{1}{2}\pi$. Let us
exclude these exceptional cases as they have measure zero.

The orthonormal column vectors $a_1, \ldots, a_k$ in (6.2) can be chosen
arbitrarily in $\mathfrak{p}$. Let them be the vectors which make the respective critical angles
with $\mathfrak{q}_0$ and such that the first component of each $a_i$ is positive. Such conditions
determine $(a_1, \ldots, a_k)$ uniquely as analytic functions of $\mathfrak{p}$ for almost all $\mathfrak{p}$,
that is, on a set $G$ of $\mathfrak{p}$ excluding merely a set of measure zero. Let $\alpha_1, \ldots, \alpha_k$
be unit vectors in $\mathfrak{q}_0$ which make the respective critical angles with $\mathfrak{p}$. Then
$\alpha_1, \ldots, \alpha_k$ are mutually orthogonal and lie along the respective projections of
$a_1, \ldots, a_k$ on $\mathfrak{q}_0$. Let $\beta_1, \ldots, \beta_k$ be the orthonormal vectors lying along the
respective projections of $a_1, \ldots, a_k$ on the orthogonal complement of $\mathfrak{q}_0$. Thus


$\alpha_i'\alpha_j = \delta_{ij}, \qquad \beta_i'\beta_j = \delta_{ij}, \qquad \alpha_i'\beta_j = 0, \qquad i, j = 1, \ldots, k,$


(7.2)  $a_i = \alpha_i \cos\theta_i + \beta_i \sin\theta_i, \qquad i = 1, \ldots, k.$
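
The vectors $a_i$, $\alpha_i$ and $\beta_i$ can be produced numerically for a given plane. The Python sketch below is an illustration under stated assumptions: NumPy is used, the fixed plane is taken as the span of the first $q$ coordinate axes, a singular value decomposition supplies the critical vectors (a method not described in the text), the sign convention on the first components is not imposed, and the function name is hypothetical. It requires $k \le q$ and no critical angle equal to 0.

```python
import numpy as np

def critical_frames(P, q):
    """Angles theta and vectors a, alpha, beta of (7.2) for the plane span(P).

    P : n x k matrix of full column rank; the fixed plane is spanned by
        the first q coordinate axes.
    """
    n, k = P.shape
    Up, _ = np.linalg.qr(P)
    # The first q rows of Up give the coordinates of the projections on the fixed plane.
    U, s, Vt = np.linalg.svd(Up[:q, :], full_matrices=False)
    order = np.argsort(s)                      # increasing cosines, decreasing angles
    s, U, V = s[order], U[:, order], Vt.T[:, order]
    theta = np.arccos(np.clip(s, 0.0, 1.0))
    a = Up @ V                                 # critical vectors in the plane
    alpha = np.zeros((n, k))
    alpha[:q, :] = U                           # unit vectors in the fixed plane
    beta = (a - alpha * np.cos(theta)) / np.sin(theta)
    return theta, a, alpha, beta
```

By construction the columns satisfy (7.2); the orthogonality relations stated before (7.2) then serve as a check on the computed vectors.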




Since $(\alpha_1, \ldots, \alpha_k)$ is a $k$-frame in the $q$-space $\mathfrak{q}_0$ it can be regarded as a point
in a Stiefel manifold. Let $\tilde V_{k,q}$ denote the part of this Stiefel manifold over which
$(\alpha_1, \ldots, \alpha_k)$ ranges. From the condition imposed above upon $a_1, \ldots, a_k$
it follows that this will be the set of all orthonormal $(\alpha_1, \ldots, \alpha_k)$ such that the
first component of each vector is positive. On the other hand, $(\beta_1, \ldots, \beta_k)$ will
range over the whole Stiefel manifold $V_{k,n-q}$. Let $\Theta$ be the set of $(\theta_1, \ldots, \theta_k)$
such that $\tfrac{1}{2}\pi > \theta_1 > \theta_2 > \cdots > \theta_k > 0$.

Thus $\mathfrak{p} \in G$ determines $(\theta_1, \ldots, \theta_k) \in \Theta$, $(\alpha_1, \ldots, \alpha_k) \in \tilde V_{k,q}$ and $(\beta_1, \ldots, \beta_k) \in
V_{k,n-q}$ uniquely, and conversely $\mathfrak{p}$ is determined by these, because by (7.2)
they determine a set of vectors $a_1, \ldots, a_k$ which span $\mathfrak{p}$. The transformations
are not only one-to-one but also analytic; hence $G$ is analytically homeomorphic
to the topological product of $\Theta$, $\tilde V_{k,q}$ and $V_{k,n-q}$. By a suitable choice of $b_1, \ldots,
b_{n-k}$ we express the differential form (6.2) on $G$ as a differential form on the
product space.


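Differentiating (7.2), we have the relations

(7.3) $\alpha_i'\,d\alpha_i = 0, \qquad \beta_i'\,d\beta_i = 0,$

$\alpha_j'\,d\alpha_i = -\alpha_i'\,d\alpha_j, \qquad \beta_j'\,d\beta_i = -\beta_i'\,d\beta_j, \qquad i \ne j,$

$da_i = (-\alpha_i \sin\theta_i + \beta_i \cos\theta_i)\,d\theta_i + d\alpha_i \cos\theta_i + d\beta_i \sin\theta_i.$

Since $\alpha_i$ and $\beta_i$ lie in fixed mutually orthogonal planes,

(7.4) $\alpha_j'\,d\beta_i = \beta_j'\,d\alpha_i = 0, \qquad i, j = 1, \dots, k.$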


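Now $(b_1, \dots, b_{n-k})$ is an arbitrary set of orthonormal vectors in the orthogonal
complement of $p$. By choosing

(7.5) $b_i = -\alpha_i \sin\theta_i + \beta_i \cos\theta_i, \qquad i = 1, \dots, k,$

we have

(7.6) $b_i'\,da_i = d\theta_i$

and

(7.7) $b_j'\,da_i = -\alpha_j'\,d\alpha_i \sin\theta_j \cos\theta_i + \beta_j'\,d\beta_i \cos\theta_j \sin\theta_i, \qquad i, j = 1, \dots, k; \ i \ne j.$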


By using the relations (7.3) and remembering to change the sign when re- 
versing the order of two linear differential forms, and remembering that any 
term containing a repeated linear form is zero, for example, (a; da;)(a;da;) = 0, 
we calculate that 


(7.8) $(b_j'\,da_i)(b_i'\,da_j) = (\alpha_j'\,d\alpha_i)(\beta_j'\,d\beta_i)(\cos^2\theta_i - \cos^2\theta_j), \qquad i \ne j.$







Choose the orthonormal vectors $b_{k+1}, \dots, b_q$ in $q_0$ perpendicular to $\alpha_1, \dots,
\alpha_k$ and choose $b_{q+1}, \dots, b_{n-k}$ in the orthogonal complement of $q_0$ perpendicular
to $\beta_1, \dots, \beta_k$. Then $b_1, \dots, b_{n-k}$ are orthonormal and span the orthogonal
complement of $p$. Furthermore

(7.9) $b_j'\,da_i = \begin{cases} b_j'\,d\alpha_i \cos\theta_i, & i = 1, \dots, k; \ j = k+1, \dots, q, \\ b_j'\,d\beta_i \sin\theta_i, & i = 1, \dots, k; \ j = q+1, \dots, n-k. \end{cases}$


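Multiplying (7.6), (7.8) and (7.9) according to the rule of the exterior product,
we see that (6.2) becomes

(7.10) $K^{-1} \prod_{i=1}^{k} \prod_{j=1}^{n-k} b_j'\,da_i = K^{-1} \prod_{i<j} \alpha_j'\,d\alpha_i \prod_{i=1}^{k} \prod_{j=k+1}^{q} b_j'\,d\alpha_i \cdot \prod_{i<j} \beta_j'\,d\beta_i \prod_{i=1}^{k} \prod_{j=q+1}^{n-k} b_j'\,d\beta_i$

$\qquad \cdot \Bigl(\prod_{i=1}^{k} \cos\theta_i\Bigr)^{q-k} \Bigl(\prod_{i=1}^{k} \sin\theta_i\Bigr)^{n-q-k} \prod_{i<j} (\cos^2\theta_i - \cos^2\theta_j)\, d\theta_1 \cdots d\theta_k.$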


Thus $\alpha_1, \dots, \alpha_k$ and $\beta_1, \dots, \beta_k$ are invariantly distributed $k$-frames in
$q_0$ and the orthogonal complement of $q_0$ respectively. Hence the invariant distribution
on the Grassmann manifold can be transformed into three independent
distributions, namely, the distribution of the critical angles that the plane
makes with a fixed plane, and two invariant distributions in Stiefel manifolds.
Since the restriction on the $a_i$ implies that the first component of each $\alpha_i$ is
positive, the $\alpha_1, \dots, \alpha_k$ range over the $(2^{-k})$th part of the Stiefel manifold while
$\beta_1, \dots, \beta_k$ ranges over the whole Stiefel manifold.

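Therefore by Theorem 5.1

(7.11) $\int_{T_{k,q}} \prod_{i<j} \alpha_j'\,d\alpha_i \prod_{i=1}^{k} \prod_{j=k+1}^{q} b_j'\,d\alpha_i = 2^{-k} \prod_{i=1}^{k} A(q - i + 1)$

and

(7.12) $\int_{V_{k,n-q}} \prod_{i<j} \beta_j'\,d\beta_i \prod_{i=1}^{k} \prod_{j=q+1}^{n-k} b_j'\,d\beta_i = \prod_{i=1}^{k} A(n - q - i + 1).$

Hence the distribution of the critical angles is

(7.13) $K(n, k, q) \Bigl(\prod_{i=1}^{k} \cos\theta_i\Bigr)^{q-k} \Bigl(\prod_{i=1}^{k} \sin\theta_i\Bigr)^{n-q-k} \prod_{i<j} (\cos^2\theta_i - \cos^2\theta_j)\, d\theta_1 \cdots d\theta_k$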


where

(7.14) $K(n, k, q) = \prod_{i=1}^{k} \dfrac{A(k - i + 1)\, A(q - i + 1)\, A(n - q - i + 1)}{2\, A(n - i + 1)}$

and

$A(n) = \dfrac{2\pi^{n/2}}{\Gamma(\tfrac12 n)}.$

The distribution of the canonical correlations is found by putting $r_i = \cos\theta_i$.
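For a single critical angle ($k = 1$) the normalization can be checked directly. The sketch below assumes the readings $A(n) = 2\pi^{n/2}/\Gamma(\tfrac12 n)$ and $K(n,k,q) = \prod_{i=1}^{k} A(k-i+1)A(q-i+1)A(n-q-i+1)/\{2A(n-i+1)\}$, and integrates the density $K(n,1,q)\cos^{q-1}\theta\,\sin^{n-q-1}\theta$ over $0 < \theta < \tfrac12\pi$ by the midpoint rule. The function names are hypothetical.

```python
import math

def A(m):
    """Area of the unit sphere in R^m: 2 pi^(m/2) / Gamma(m/2)."""
    return 2.0 * math.pi ** (m / 2.0) / math.gamma(m / 2.0)

def K(n, k, q):
    """Normalizing constant of the critical-angle distribution (assumed form)."""
    c = 1.0
    for i in range(1, k + 1):
        c *= A(k - i + 1) * A(q - i + 1) * A(n - q - i + 1) / (2.0 * A(n - i + 1))
    return c

def total_mass_k1(n, q, steps=20_000):
    """Midpoint-rule integral of the k = 1 density over (0, pi/2)."""
    h = (math.pi / 2.0) / steps
    total = 0.0
    for j in range(steps):
        t = (j + 0.5) * h
        total += math.cos(t) ** (q - 1) * math.sin(t) ** (n - q - 1)
    return K(n, 1, q) * total * h
```

Under these assumptions `total_mass_k1(n, q)` should be close to 1 for any $1 \le q < n$, which is a necessary condition for the constant to be correct but does not test the cases $k > 1$.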







CASE 2. $q < n < k + q$. The planes $p$ and $q_0$ must intersect in a $(k + q - n)$-dimensional
space; therefore $k + q - n$ of the critical angles will be identically
zero, leaving only $\theta_1, \dots, \theta_{n-q}$ different from zero.

CASE 2a. $k \le q$.

Step I. According to (7.1) we can take a fixed plane $p_0$ and a random plane $q$
instead of $p$ and $q_0$. If $a_1, \dots, a_q$ are chosen in $q$ and $b_1, \dots, b_{n-q}$ in the orthogonal
complement of $q$, then the distribution of $q$ is given by

$K^{-1} \prod_{i=1}^{q} \prod_{j=1}^{n-q} b_j'\,da_i.$


Step II. But this differential form equally represents the distribution of the
orthogonal complement, $q^*$, of $q$, which has dimension $n - q < k$. Indeed

(7.15) $K^{-1} \prod_{i=1}^{q} \prod_{j=1}^{n-q} b_j'\,da_i = K^{-1} \prod_{i=1}^{q} \prod_{j=1}^{n-q} a_i'\,db_j.$

The critical angles $\theta_1^*, \dots, \theta_{n-q}^*$ between $q^*$ and $p_0$ are the complements of the
nonzero critical angles between $q$ and $p_0$, that is

(7.16) $\theta_i^* = \tfrac12\pi - \theta_i, \qquad i = 1, \dots, n - q.$


We can carry out the preceding analysis, interchanging the roles of $a_i$ and $b_j$.
Using the correspondences

Old    New
$n$      $n$        dimension of space
$k$      $n - q$    dimension of random plane
$q$      $k$        dimension of fixed plane

from (7.13) we obtain the distribution of the $\theta_i^*$:

$K(n, n-q, k) \Bigl(\prod_{i=1}^{n-q} \cos\theta_i^*\Bigr)^{k+q-n} \Bigl(\prod_{i=1}^{n-q} \sin\theta_i^*\Bigr)^{q-k} \prod_{i<j} (\cos^2\theta_i^* - \cos^2\theta_j^*)\, d\theta_1^* \cdots d\theta_{n-q}^*$

and putting $\theta_i^* = \tfrac12\pi - \theta_i$, the distribution of the $\theta_i$:

(7.17) $K(n, n-q, k) \Bigl(\prod_{i=1}^{n-q} \sin\theta_i\Bigr)^{k+q-n} \Bigl(\prod_{i=1}^{n-q} \cos\theta_i\Bigr)^{q-k} \prod_{i<j} (\cos^2\theta_i - \cos^2\theta_j)\, d\theta_1 \cdots d\theta_{n-q}.$


CASE 2b. $k \ge q$, $k < n < k + q$. We only require Step II. That is, we take
the orthogonal complement of $p$, instead of $p$, as the random plane. Hence
the correspondences are

Old    New
$n$      $n$        dimension of space
$k$      $n - k$    dimension of random plane
$q$      $q$        dimension of fixed plane.

The distribution of the $\theta_i^*$ is

$K(n, n-k, q) \Bigl(\prod_{i=1}^{n-k} \cos\theta_i^*\Bigr)^{k+q-n} \Bigl(\prod_{i=1}^{n-k} \sin\theta_i^*\Bigr)^{k-q} \prod_{i<j} (\cos^2\theta_i^* - \cos^2\theta_j^*)\, d\theta_1^* \cdots d\theta_{n-k}^*$

and hence that of the $\theta_i$ is

(7.18) $K(n, n-k, q) \Bigl(\prod_{i=1}^{n-k} \sin\theta_i\Bigr)^{k+q-n} \Bigl(\prod_{i=1}^{n-k} \cos\theta_i\Bigr)^{k-q} \prod_{i<j} (\cos^2\theta_i - \cos^2\theta_j)\, d\theta_1 \cdots d\theta_{n-k}.$

As before, the distribution of the canonical correlation coefficients is obtained
by putting $r_i = \cos\theta_i$.


8. Decomposition of the distribution of a normal multivariate sample. 

8.1. Introduction. The distribution of $n$ independent observations from a
univariate normal population with zero mean and unit variance can be split
into two independent distributions by transformation to spherical polar coordinates;
namely, the $\chi^2$ distribution and the invariant distribution of a vector.
The latter can be expressed in terms of the element of area on the unit sphere
in $R^n$. This result is useful in deriving the various sampling distributions.

By using the exterior differential forms (see Sections 4.5 to 5.2) for the invariant
measures on the Grassmann and Stiefel manifolds, we shall derive the
multivariate analogue of this decomposition. Let $X$ be an $n \times k$ matrix ($k \le n$)
whose rows are $n$ independent observations from a normal $k$-variate population
with means zero and variance covariance matrix $\Sigma$. $X$ is distributed as in (2.1).

THEOREM 8.1. The distribution (2.1) of a normal k-variate sample can be decomposed
into three independent distributions

a. essentially the Wishart distribution,

b. the invariant distribution of the plane spanned by the vectors $x_1, \dots, x_k$,

c. the invariant distribution of the orthogonal $k \times k$ matrix which determines
the orientation of $x_1, \dots, x_k$ in the plane they span.

The process of decomposition yields, incidentally, the distribution of the 
latent roots of the sample variance covariance matrix, a distribution found by 
Fisher [5], in the special case that all the population latent roots are equal. 

Let $X$ be an $n \times k$ matrix distributed as in (2.1). We can put

(8.1) $X = A L G'$

where
1. $A$ is an $n \times k$ matrix such that

(8.2) $A'A = I_k,$

2. $L$ is a $k \times k$ diagonal matrix with $l_1 > l_2 > \dots > l_k > 0$ down the diagonal,
$l_1^2, \dots, l_k^2$ being the latent roots of $X'X$,

3. $G$ is a $k \times k$ orthogonal matrix with the elements in the first row positive.

Equation (8.1) holds for almost all $X$ and determines $A$, $L$, and $G$ uniquely.
To obtain (8.1), let $G$ be the matrix satisfying Condition 3, which reduces
$X'X$ to diagonal form $L^2$, that is, such that $X'X = G L^2 G'$. Putting $A = X G L^{-1}$
yields (8.1) with $A$ satisfying (8.2). (8.1) implies that the Euclidean space $R^{nk}$
of matrices $X$ is, apart from a set of measure zero, analytically homeomorphic
to the topological product of a Stiefel manifold $V_{k,n}$, a simplex in $R^k$ and part
of an orthogonal group manifold, over which $A$, $L$ and $G$ range respectively. We
now express the volume element $\prod dx_{ij}$ of $R^{nk}$ as an exterior product of differential
forms on these manifolds.
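The construction of (8.1) just described can be sketched numerically as follows. This is an illustration only: it uses a symmetric eigendecomposition of $X'X$ from NumPy, and the function name `decompose` is hypothetical.

```python
import numpy as np

def decompose(X):
    """Return (A, l, G) with X = A @ diag(l) @ G.T, following (8.1)."""
    w, G = np.linalg.eigh(X.T @ X)      # latent roots of X'X in ascending order
    order = np.argsort(w)[::-1]         # reorder so that l_1 > l_2 > ... > l_k
    w, G = w[order], G[:, order]
    G = G * np.sign(G[0, :])            # make the first row of G positive
    l = np.sqrt(w)
    A = X @ G / l                       # A = X G L^{-1}, column j divided by l_j
    return A, l, G

rng = np.random.default_rng(1)
n, k = 6, 3
X = rng.standard_normal((n, k))
A, l, G = decompose(X)
```

For a matrix drawn from a continuous distribution the exceptional set (equal or zero roots, or a zero in the first row of $G$) has probability zero, so the sign normalization in this sketch is well defined almost surely.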

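Differentiate (8.1):

(8.3) $dX = dA\, L\, G' + A\, dL\, G' + A\, L\, dG'.$

Choose an $n \times (n - k)$ matrix $B$ such that the partitioned matrix $[A \mid B]$ is orthogonal.
Premultiply (8.3) by the transpose of $[A \mid B]$ and post-multiply by $G$;

(8.4) $\begin{bmatrix} A' \\ B' \end{bmatrix} dX\, G = \begin{bmatrix} A'\,dA\, L + dL + L\, dG'\, G \\ B'\,dA\, L \end{bmatrix} = \begin{bmatrix} A'\,dA\, L + dL - L\, G'\,dG \\ B'\,dA\, L \end{bmatrix}$

since $G'\,dG$, like $A'\,dA$, is skew symmetric (cf. Section 4.5).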

To evaluate the exterior product of the left-hand side of (8.4), consider first
a single column $dx_j$ of $dX$. By Lemma 4.1 the exterior product of the elements
of the transformed vector $[A \mid B]'\,dx_j$ is $|A \mid B| \prod_i dx_{ij}$. Hence the exterior
product of the elements of the matrix $[A \mid B]'\,dX$ is $|A \mid B|^k \prod_{i,j} dx_{ij}$. In the
left-hand side of (8.4) the row vectors of the matrix $[A \mid B]'\,dX$ are transformed
by $G$. Hence the exterior product of the elements in a single row will be multiplied
by $|G|$ and since $[A \mid B]'\,dX$ has $n$ rows, the exterior product of all the
elements is multiplied by $|G|^n$. Therefore the exterior product of the elements
in the left-hand side of (8.4) is

(8.5) $|A \mid B|^k\, |G|^n \prod_{i,j} dx_{ij}$

which equals $\prod_{i,j} dx_{ij}$ since $[A \mid B]$ and $G$ are orthogonal matrices.


The exterior product of the $(ij)$th and $(ji)$th elements of the matrix on the
right-hand side of (8.4) is

(8.6) $(a_i'\,da_j\, l_j - l_i\, g_i'\,dg_j)(a_j'\,da_i\, l_i - l_j\, g_j'\,dg_i) = (a_j'\,da_i)(g_j'\,dg_i)(l_j^2 - l_i^2), \qquad i \ne j; \ i, j = 1, \dots, k.$







By considering that the row vectors of the matrix $B'\,dA$ are transformed by $L$
we see that the alternating product of the elements of the matrix $B'\,dA\, L$ is

(8.7) $\prod_{i=1}^{k} l_i^{\,n-k} \prod_{j=1}^{n-k} \prod_{i=1}^{k} b_j'\,da_i,$

where $b_j$ is the $j$th column of $B$ and $a_i$ is the $i$th column of $A$. Therefore, from
(8.5), (8.6) and (8.7)

(8.8) $\prod_{i,j} dx_{ij} = \Bigl(\prod_{i=1}^{k} l_i\Bigr)^{n-k} \prod_{i<j} (l_i^2 - l_j^2)\, dl_1 \cdots dl_k \prod_{i<j} g_j'\,dg_i \cdot \prod_{i<j} a_j'\,da_i \prod_{j=1}^{n-k} \prod_{i=1}^{k} b_j'\,da_i.$
This is an interesting decomposition of the volume element of the $nk$-dimensional
Euclidean sample space.$^5$ The differential form

(8.9) $\prod_{i<j} g_j'\,dg_i$

is the invariant measure on the orthogonal group, which was discussed in Section 4.4,
and the differential form

(8.10) $\prod_{i<j} a_j'\,da_i \prod_{i=1}^{k} \prod_{j=1}^{n-k} b_j'\,da_i$

is the invariant measure on the Stiefel manifold $V_{k,n}$ of $k$-frames in $R^n$ (see
Section 4.6). The decomposition leads immediately to the distribution of the
latent roots of the sample variance covariance matrix.

8.2. Latent roots of the variance covariance matrix. The distribution of the latent
roots, $l_1^2, \dots, l_k^2$, is found by integrating over the Stiefel manifold and over the
group of orthogonal matrices. Since the density function in (2.1) does not depend
upon $A$, the integral over the Stiefel manifold can be evaluated separately and
is given by (5.10).

Let $C$ be the $k \times k$ orthogonal matrix which reduces $\Sigma$ to diagonal form

(8.11) $C'\Sigma C = \Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_k).$

The columns of $C$ are linear functions which give the extremal variances in the
population. Since the columns of $G$ are the linear functions which give the extremal
variances in the sample, $G$ is the maximum likelihood estimate of $C$.
From (8.11) and (8.1)

(8.12) $\sigma^{\mu\nu} = \sum_{i=1}^{k} c_{\mu i}\, c_{\nu i} / \lambda_i,$

(8.13) $x_\mu' x_\nu = \sum_{j=1}^{k} g_{\mu j}\, g_{\nu j}\, l_j^2,$

where $\sigma^{\mu\nu}$ is the $(\mu, \nu)$th element of $\Sigma^{-1}$ and $x_\mu$ is the $\mu$th column of $X$.


5 This corresponds to the result given by Olkin [14] p. 29 







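Hence from (2.1), (8.8), (5.10), (8.12) and (8.13) we get the joint distribution
of the latent roots $l_1^2, \dots, l_k^2$ of $X'X$ and the linear functions $G$;

(8.14) $dF(l_1^2, \dots, l_k^2; G) = \dfrac{\prod_{i=1}^{k} A(n - i + 1)}{(2\pi)^{nk/2}\, 2^k \prod_{i=1}^{k} \lambda_i^{n/2}} \prod_{i<j} (l_i^2 - l_j^2) \Bigl(\prod_{i=1}^{k} l_i^2\Bigr)^{\frac12 (n-k-1)} dl_1^2 \cdots dl_k^2$

$\qquad \cdot \exp\Bigl(-\tfrac12 \sum_{i,j,\mu,\nu} c_{\mu i}\, c_{\nu i}\, g_{\mu j}\, g_{\nu j}\, l_j^2 / \lambda_i\Bigr) \prod_{i<j} g_j'\,dg_i$

where $A(\nu)$ is given in (5.9).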
To obtain the distribution of the latent roots of $X'X$ alone, put $S = C'G$.
Being the invariant measure,

$\prod_{i<j} g_j'\,dg_i = \prod_{i<j} s_j'\,ds_i.$

By dividing (8.14) by $2^k$, we can drop the restriction that the elements in the
first row of $G$ must be positive and thus let $G$ and hence $S$ range over the whole
(improper) orthogonal group. The distribution of the latent roots of $X'X$ is then

$dF(l_1^2, \dots, l_k^2) = \dfrac{\prod_{i=1}^{k} A(n - i + 1)}{(2\pi)^{nk/2}\, 2^{2k} \prod_{i=1}^{k} \lambda_i^{n/2}} \Bigl(\prod_{i=1}^{k} l_i^2\Bigr)^{\frac12 (n-k-1)} \prod_{i<j} (l_i^2 - l_j^2)$

$\qquad \cdot \Bigl(\int \exp\Bigl(-\tfrac12 \sum_{i,j} \lambda_i^{-1} s_{ij}^2\, l_j^2\Bigr) \prod_{i<j} s_j'\,ds_i\Bigr)\, dl_1^2 \cdots dl_k^2,$

the integral being taken over the orthogonal group.
In the special cases

a. When the population latent roots, $\lambda_i$, are all equal to $\lambda$, the exponential term
in the integral becomes

$e^{-(1/2\lambda) \sum_j l_j^2},$

that is, independent of $S$, and we only have the invariant measure on the orthogonal
group whose integral is given by (5.16). This distribution was found by
Fisher.

b. When $k = 2$, we can put

$S = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}, \qquad s_2'\,ds_1 = d\theta,$

and the integral is expressible as an imaginary Bessel function of zero order as
given by Girshick.
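The reduction to a Bessel function can be illustrated numerically. The sketch below takes the exponent to be $-\tfrac12 \sum_{i,j} \lambda_i^{-1} s_{ij}^2 l_j^2$ with $S$ the $2 \times 2$ rotation through $\theta$, integrates over $0 \le \theta < 2\pi$ (the proper rotations only), and compares with $2\pi e^{-(P+Q)/4} I_0\{(P-Q)/4\}$, where $P = l_1^2/\lambda_1 + l_2^2/\lambda_2$ and $Q = l_2^2/\lambda_1 + l_1^2/\lambda_2$. The closed form is derived here from the identity $\cos^2\theta = \tfrac12(1 + \cos 2\theta)$ and is not quoted from Girshick; the function names are hypothetical.

```python
import numpy as np

def rotation_integral_k2(l_sq, lam, steps=4096):
    """Midpoint-rule integral of exp(-1/2 sum_ij s_ij^2 l_j^2 / lam_i) over rotations."""
    h = 2.0 * np.pi / steps
    theta = (np.arange(steps) + 0.5) * h
    c2, s2 = np.cos(theta) ** 2, np.sin(theta) ** 2
    # s_11^2 = s_22^2 = cos^2(theta), s_12^2 = s_21^2 = sin^2(theta)
    quad = (c2 * (l_sq[0] / lam[0] + l_sq[1] / lam[1])
            + s2 * (l_sq[1] / lam[0] + l_sq[0] / lam[1]))
    return float(np.exp(-0.5 * quad).sum() * h)

def bessel_form_k2(l_sq, lam):
    """2 pi exp(-(P+Q)/4) I_0((P-Q)/4), with I_0 the modified Bessel function."""
    P = l_sq[0] / lam[0] + l_sq[1] / lam[1]
    Q = l_sq[1] / lam[0] + l_sq[0] / lam[1]
    return float(2.0 * np.pi * np.exp(-(P + Q) / 4.0) * np.i0((P - Q) / 4.0))
```

When $\lambda_1 = \lambda_2$ the argument of $I_0$ vanishes and the integrand no longer depends on $\theta$, in agreement with special case a.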
8.3. The Wishart distribution. The variables $l_1, \dots, l_k$, $G$ can be expressed
in terms of $X'X$.

From (8.1) $X'X = G L^2 G'$. Differentiating and pre- and post-multiplying by
$G'$ and $G$ respectively

(8.15) $G'\, d(X'X)\, G = G'\,dG\, L^2 + L^2\, dG'\, G + 2L\, dL = G'\,dG\, L^2 - L^2\, G'\,dG + 2L\, dL.$

The $(i, j)$th element of the matrix in the right-hand side of (8.15) is

(8.16) $g_i'\,dg_j\, (l_j^2 - l_i^2)$ for $i \ne j$, and $2 l_i\, dl_i$ for $i = j$.


To evaluate the alternating product of the diagonal and super diagonal elements
of the left-hand side of (8.15), notice that

$d(X'X) \to G'\, d(X'X)\, G$

is a linear transformation of $d(X'X)$, regarded as a vector in a space of dimension
$\tfrac12 k(k + 1)$. The coefficient of $\prod_{i \le j} d(x_i'x_j)$ will be the determinant of this linear
transformation, which, by an argument similar to the proof of Lemma 4.4, is
proved to be a power of $|G|$ which is 1. Hence the exterior product of the diagonal
and super diagonal elements of the matrix on the left-hand side of (8.15) is

(8.17) $\prod_{i \le j} d(x_i'x_j).$
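(8.17) and (8.16) give

(8.18) $\prod_{i \le j} d(x_i'x_j) = 2^k \Bigl(\prod_{i=1}^{k} l_i\Bigr) \prod_{i<j} (l_i^2 - l_j^2) \prod_{i<j} g_j'\,dg_i\, dl_1 \cdots dl_k.$

Using (8.18) to substitute for $\prod_{i<j} g_j'\,dg_i$ in (8.8) yields

(8.19) $\prod_{i,j} dx_{ij} = 2^{-k}\, |X'X|^{\frac12 (n-k-1)} \prod_{i \le j} d(x_i'x_j) \prod_{i<j} a_j'\,da_i \prod_{j=1}^{n-k} \prod_{i=1}^{k} b_j'\,da_i.$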

Adjoining of the density factor of (2.1) and integration over the Stiefel manifold
(given by (5.10)) yield the distribution of $X'X$ which is essentially the Wishart
distribution:

(8.20) $dF(X'X) = \dfrac{\prod_{i=1}^{k} A(n - i + 1)}{(2\pi)^{nk/2}\, 2^k\, |\Sigma|^{n/2}}\, |X'X|^{\frac12 (n-k-1)}\, e^{-\frac12 \mathrm{tr}(\Sigma^{-1} X'X)} \prod_{i \le j} d(x_i'x_j)$

where $A(\nu)$ is given in (5.9).
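As a consistency check on the constant in (8.20), the sketch below compares $\prod_{i=1}^{k} A(n-i+1)/\{(2\pi)^{nk/2}\, 2^k\}$ with the usual Wishart constant $1/\{2^{nk/2}\, \pi^{k(k-1)/4} \prod_{i=1}^{k} \Gamma(\tfrac12(n-i+1))\}$. It assumes $A(\nu) = 2\pi^{\nu/2}/\Gamma(\tfrac12\nu)$ for the function of (5.9), which is not reproduced in this section, and works on the logarithmic scale; the function names are hypothetical.

```python
import math

def log_A(m):
    """log of 2 pi^(m/2) / Gamma(m/2), the assumed form of A in (5.9)."""
    return math.log(2.0) + (m / 2.0) * math.log(math.pi) - math.lgamma(m / 2.0)

def log_const_820(n, k):
    """log of prod_i A(n-i+1) / ((2 pi)^(nk/2) 2^k)."""
    s = sum(log_A(n - i + 1) for i in range(1, k + 1))
    return s - (n * k / 2.0) * math.log(2.0 * math.pi) - k * math.log(2.0)

def log_const_wishart(n, k):
    """log of 1 / (2^(nk/2) pi^(k(k-1)/4) prod_i Gamma((n-i+1)/2))."""
    s = sum(math.lgamma((n - i + 1) / 2.0) for i in range(1, k + 1))
    return -(n * k / 2.0) * math.log(2.0) - (k * (k - 1) / 4.0) * math.log(math.pi) - s
```

The two logarithms agree for every $n \ge k \ge 1$ under the stated assumption, since the powers of $\pi$ in $\prod A(n-i+1)$ sum to $nk/2 - k(k-1)/4$.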

8.4. The general decomposition. Formula (8.19) shows that the distribution
(2.1) of a normal multivariate sample can be decomposed into two independent
distributions, a Wishart distribution and an invariant distribution on the
Stiefel manifold.

To split off the distribution of the plane spanned by the columns of X, we 
decompose the differential form 





(8.21) $\prod_{i<j} a_j'\,da_i \prod_{j=1}^{n-k} \prod_{i=1}^{k} b_j'\,da_i$


for the invariant measure in the Stiefel manifold into two differential forms 
representing independent distributions, namely, the invariant measure on the 
Grassmann manifold and the invariant measure on an orthogonal group of order 
k. The decomposition is given by equation (5.22). 

From (5.22) and (8.19) we have the complete decomposition of the distribu- 
tion (2.1): 


ame 


k 
| —$tr(Z~1x’x) viv \a(n—k—1) ae 
(2n)i™* Qe é | a | IT d(x; xj) 


dF(X) = 
(8.22) 


n~k k - 


- TT I] 85 dh, [Te dex, 


j=l im] <I 


which is the result stated in Theorem 8.1. 

In the univariate case, the Wishart distribution becomes the $\chi^2$, the invariant
measure on the Grassmann manifold becomes the element of area on the unit
n-sphere, and the third factor disappears.


9. Acknowledgements. The author expresses his gratitude to Dr. H. Schwerdtfeger,
of Melbourne University, to Dr. E. A. Cornish, of the C.S.I.R.O., Adelaide,
and to Professors S. S. Wilks and W. Feller of Princeton University for
their help and encouragement.


REFERENCES

[1] M. S. Bartlett, "The vector representation of a sample," Proc. Cambridge Philos. Soc., Vol. 30 (1934), pp. 327-340.
[2] W. Blaschke, Integralgeometrie, Actualités Scientifiques et Industrielles, Vol. 252, Hermann, Paris, 1935.
[3] S. S. Chern, "On Grassmann and differential rings and their relation to the theory of multiple integrals," Sankhyā, Vol. 7 (1945), pp. 1-8.
[4] C. Chevalley, The Theory of Lie Groups, Princeton University Press, 1946.
[5] R. A. Fisher, "The sampling distribution of some statistics obtained from non-linear equations," Ann. Eugenics, Vol. 9 (1939), pp. 238-249.
[6] M. A. Girshick, "On the sampling theory of the roots of determinantal equations," Ann. Math. Stat., Vol. 10 (1939), pp. 203-224.
[7] E. Goursat, Leçons sur le Problème de Pfaff, Hermann, Paris, 1922, chap. 3.
[8] H. Hotelling, "Relations between two sets of variates," Biometrika, Vol. 28 (1936), pp. 321-377.
[9] P. L. Hsu, "On the distribution of the roots of certain determinantal equations," Ann. Eugenics, Vol. 9 (1939), pp. 250-258.
[10] A. Hurwitz, Über die Erzeugung der Invarianten durch Integration (1897), Mathematische Werke II, Birkhäuser, Basel (1933), pp. 546-564.
[11] E. Kähler, Einführung in die Theorie der Systeme von Differentialgleichungen, Chelsea, Berlin, 1934.
[12] C. C. MacDuffee, Vectors and Matrices, Math. Assoc. Am., 1943.
[13] A. M. Mood, "On the distribution of the characteristic roots of normal second-moment matrices," Ann. Math. Stat., Vol. 22 (1951), pp. 266-273.
[14] I. Olkin, "On Distribution Problems in Multivariate Analysis," Institute of Statistics, Mimeo. Series No. 43, 1951.
[15] I. Olkin and W. L. Deemer, "The Jacobians of certain matrix transformations useful in multivariate analysis," Biometrika, Vol. 38 (1951), p. 345.
[16] S. N. Roy, "p-statistics or some generalizations in analysis of variance appropriate to multivariate problems," Sankhyā, Vol. 4 (1939), pp. 381-396.
[17] S. N. Roy, "A note on critical angles between two flats in hyperspace with certain statistical applications," Sankhyā, Vol. 8 (1947), pp. 177-194.
[18] N. Steenrod, The Topology of Fibre Bundles, Princeton University Press, 1951.
[19] A. Weil, L'Intégration dans les Groupes Topologiques et ses Applications, Actualités Scientifiques et Industrielles, Hermann, Paris, 1953.





THE MAXIMA OF THE MEAN LARGEST VALUE 
AND OF THE RANGE 
By E. J. Gumbel
Columbia University 

Summary and Introduction. R. L. Plackett derived the maximum of the ratio
of mean range to the standard deviation as a function of the sample size, and
gave the initial (symmetrical) distribution for which this maximum is actually
reached. On the other hand, Moriguti derived the maximum for the mean largest 
value under the assumption that the distribution from which the maximum is 
taken is symmetrical. His mean value turned out to be one half of the value 
given by Plackett. 

In the following, these results will be generalized for an arbitrary (not neces- 
sarily symmetrical) continuous variate. The mean and the standard deviation 
of the largest value and the mean range will be given for two distributions: 
one where the mean largest value is a maximum, and another one where the 
mean range is a maximum. 

Obviously, a mean largest value can exist if and only if the initial mean exists. 
In addition we postulate in both cases the existence of the second moment. 


1. Initial distribution which maximizes the mean largest value. The initial
population mean $\bar{x}$, mean square $\overline{x^2}$, variance $\sigma^2$, the mean $\bar{x}_n$, mean square $\overline{x_n^2}$,
variance $\sigma_n^2$, and the generating function $G_n(t)$ of the largest among $n$ independent
observations, finally the mean range $\bar{w}_n$ for $n$ observations are written as

(1.1)
$\bar{x} = \int_0^1 x\, dF, \qquad \overline{x^2} = \int_0^1 x^2\, dF, \qquad \sigma^2 = \overline{x^2} - \bar{x}^2,$

$\bar{x}_n = n \int_0^1 x F^{n-1}\, dF, \qquad \overline{x_n^2} = n \int_0^1 x^2 F^{n-1}\, dF, \qquad \sigma_n^2 = n \int_0^1 (x - \bar{x}_n)^2 F^{n-1}\, dF,$

$G_n(t) = n \int_0^1 e^{xt} F^{n-1}\, dF, \qquad \bar{w}_n = n \int_0^1 x \{F^{n-1} - (1 - F)^{n-1}\}\, dF,$

where $x = x(F)$. Corresponding designations will be used for transformed
variates.

In order to derive the initial distribution which maximizes the mean largest
value $\bar{x}_n$ for given values of the initial mean and standard deviation, we look
for the corresponding variate $x(F)$, and put the first variation of

$\int_0^1 [n x F^{n-1} - \lambda_1 x^2 - \lambda_2 x]\, dF$

with respect to $x$ equal to zero. Here $\lambda_1$ and $\lambda_2$ are constant factors which will
take on the role of parameters. The operation leads to

$n F^{n-1} - 2\lambda_1 x - \lambda_2 = 0,$

whence

(1.2) $x = (n F^{n-1} - \lambda_2)/2\lambda_1.$


Received 7/30/52, revised 7/23/53. 





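To obtain the maximum of $\bar{x}_n$ we have to eliminate the parameters $\lambda_1$ and
$\lambda_2$. The mean largest value becomes from (1.1) after integration of (1.2)

(1.3) $\bar{x}_n = \dfrac{1}{2\lambda_1}\, \dfrac{n^2}{2n - 1} - \dfrac{\lambda_2}{2\lambda_1}.$

The initial mean obtained for $n = 1$ is

(1.4) $\bar{x} = \dfrac{1 - \lambda_2}{2\lambda_1}.$

The initial standard deviation becomes from (1.2) and (1.4)

(1.5) $\sigma = \dfrac{1}{2\lambda_1}\, \dfrac{n - 1}{\sqrt{2n - 1}}.$

Combination of the three preceding equations leads to the mean largest value

(1.6) $\bar{x}_n \le \bar{x} + \sigma\, \dfrac{n - 1}{\sqrt{2n - 1}}.$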

Therefore, the mean largest value for any continuous distribution possessing the
first two moments increases more slowly than $\sqrt{n/2}$ times the initial standard
deviation.


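The initial probability function $F(x, n)$ for which the bound (1.6) is actually
reached is from (1.2)

$F(x, n) = \Bigl(\dfrac{2\lambda_1 x + \lambda_2}{n}\Bigr)^{1/(n-1)}.$

The parameters $\lambda_1$ and $\lambda_2$ are from (1.4) and (1.5)

$\lambda_1 = \dfrac{n - 1}{2\sigma\sqrt{2n - 1}}, \qquad \lambda_2 = 1 - \dfrac{\bar{x}(n - 1)}{\sigma\sqrt{2n - 1}},$

whence

(1.7) $F(x, n) = \Bigl[\dfrac{(n - 1)(x - \bar{x})/\sigma}{n\sqrt{2n - 1}} + \dfrac{1}{n}\Bigr]^{1/(n-1)}.$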

If we introduce the standardized variate $z$ with zero mean and unit variance,

(1.8) $z = (x - \bar{x})/\sigma,$

its bounds $z_0$ and $z_1$ are, for $F(z) = 0$ and $F(z) = 1$,

(1.9) $z_0 = -\dfrac{\sqrt{2n - 1}}{n - 1}, \qquad z_1 = \sqrt{2n - 1}.$

Therefore the domain of variation spreads with $n$ increasing and the lower bound
$z_0$ approaches zero in the negative domain.
The probability function $\Phi(z, n)$ of the reduced variate $z$ is from (1.7)

(1.10) $\Phi(z, n) = \Bigl[\dfrac{(n - 1)\, z}{n\sqrt{2n - 1}} + \dfrac{1}{n}\Bigr]^{1/(n-1)}, \qquad -\dfrac{\sqrt{2n - 1}}{n - 1} \le z \le \sqrt{2n - 1}.$





Graph 1. Probability functions which maximize the mean largest value (horizontal axis: standardized variate $z$).


For $n = 2$ the probability function is linear within the domain $-\sqrt{3} \le z \le \sqrt{3}$.
For $n$ increasing the initial median

$(n\, 2^{-(n-1)} - 1)\, \dfrac{\sqrt{2n - 1}}{n - 1}$

starts with zero for $n = 2$, is negative for $n \ge 3$ and diminishes with increasing
$n$ to the lower bound $z_0$. Therefore, for $z < 0$ the probability function approaches
a straight line, perpendicular to the $z$ axis, as shown in Graph 1.


Since, from (1.10), the variate $z$ is

(1.11) $z = (n\Phi^{n-1} - 1)\, \dfrac{\sqrt{2n - 1}}{n - 1},$

the density function $\varphi(z, n)$ obtained from $\varphi(z) = 1/(dz/d\Phi)$ is

(1.12) $\varphi(z, n) = 1/(n\sqrt{2n - 1}\, \Phi^{n-2}).$


This is uniform for $n = 2$. For $n \ge 3$, the density is infinite at the lower bound
$z_0$ and decreases uniformly with increasing values of $z$. At the upper bound the
value of the density function $\varphi(z_1, n) = 1/(n\sqrt{2n - 1})$ does not vanish, but is
very small. For increasing values of $n$, the density functions approach more and
more parallels to the vertical axis located at the lower bound $z_0$, and spread at the same time
along the horizontal axis. The probability function $\Phi(z, n)$ and the density function
$\varphi(z, n)$ which maximize the mean largest value are traced in Graphs 1 and 2.

Graph 2. Density functions which maximize the mean largest value.

It is not to be wondered that a distribution obtained from a queer condition
should show strange properties.
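The properties of this distribution can be checked numerically from its quantile function $z = (n\Phi^{n-1} - 1)\sqrt{2n-1}/(n-1)$. The following sketch integrates over $0 < \Phi < 1$ by the midpoint rule and compares the mean largest value with $(n-1)/\sqrt{2n-1}$; the function names are hypothetical and the quadrature is only an illustration, not part of the derivation.

```python
import numpy as np

def z_of_phi(phi, n):
    """Quantile function of the distribution maximizing the mean largest value."""
    return (n * phi ** (n - 1) - 1.0) * np.sqrt(2 * n - 1) / (n - 1)

def moments(n, steps=200_000):
    """Midpoint-rule values of the initial mean, variance and mean largest value."""
    phi = (np.arange(steps) + 0.5) / steps
    z = z_of_phi(phi, n)
    mean = z.mean()
    var = (z ** 2).mean() - mean ** 2
    mean_largest = (n * z * phi ** (n - 1)).mean()
    return mean, var, mean_largest

def bound(n):
    """Upper bound (n - 1) / sqrt(2n - 1) for the mean largest reduced value."""
    return (n - 1) / np.sqrt(2 * n - 1)
```

For $n \ge 2$ the three values returned by `moments(n)` should be close to 0, 1 and `bound(n)` respectively, so that the variate is standardized and attains the bound.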


2. Extremes. Consider now the extreme values of the initial distribution
(1.10). The mean largest reduced value itself is, of course,

(2.1) $\bar{z}_n = \dfrac{n - 1}{\sqrt{2n - 1}}$

in accordance with (1.6).

The characteristic largest value $u_n$, which, in previous publications [3], was
called the expected largest value, defined by $\Phi(u_n) = 1 - 1/n$, becomes from
(1.10), even for moderate values of $n$,

(2.2) $u_n \approx \dfrac{n - e}{e} \cdot \dfrac{\sqrt{2n - 1}}{n - 1}.$

It is smaller than the mean largest value and increases asymptotically as $\sqrt{2n}/e$,
that is, more slowly than the mean largest value. The mean square of the reduced
largest value is obtained from (1.1) and (1.11) as

$\overline{z_n^2} = \dfrac{n(2n - 1)}{(n - 1)^2} \int_0^1 (n^2 \Phi^{3n-3} - 2n \Phi^{2n-2} + \Phi^{n-1})\, d\Phi = \dfrac{2n^2 - 3n + 2}{3n - 2}.$







Formula (2.1) leads to the standard deviation $\sigma_n$ of the largest value

(2.3) $\sigma_n = n\sqrt{n/[(2n - 1)(3n - 2)]}$

which converges to $\sqrt{n/6}$. Consequently, the coefficient of variation converges
to $1/\sqrt{3}$.

In addition to the standard deviation of the largest value, we calculate the
mean range for the variate which shows a maximum of the mean largest value.
The mean range $\bar{w}_n$ for the variate $z$ in standard units becomes from (1.1) and
(1.11)

$\bar{w}_n = \dfrac{n\sqrt{2n - 1}}{n - 1} \int_0^1 \bigl(n\Phi^{2n-2} - \Phi^{n-1} - n\Phi^{n-1}(1 - \Phi)^{n-1} + (1 - \Phi)^{n-1}\bigr)\, d\Phi.$

Therefore, the mean range is obtained after trivial calculations as

(2.4) $\bar{w}_n = \dfrac{n^2}{(n - 1)\sqrt{2n - 1}} \Bigl[1 - \dfrac{(n - 1)!^2}{(2n - 2)!}\Bigr].$

These values are traced in Graph 3 and converge to

(2.4′) $\bar{w}_n \sim \sqrt{n/2}.$


Thus $\bar{w}_n$ converges from above toward $\bar{z}_n$. This is not to be wondered at, since
the lower bound $z_0$ converges toward zero, and, therefore, $\bar{z}_n$ constitutes a larger and larger part
of the range. It follows that the quotients $\sigma_n/\bar{w}_n$ converge toward the same value
$1/\sqrt{3}$ as the coefficient of variation. Finally, the quotient $\bar{w}_n/\bar{z}_n$ converges
towards unity.
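The closed forms for the standard deviation of the largest value and for the mean range can be compared with direct quadrature. The sketch below assumes the readings $\sigma_n = n\sqrt{n/[(2n-1)(3n-2)]}$ and $\bar{w}_n = n^2[1 - (n-1)!^2/(2n-2)!]/\{(n-1)\sqrt{2n-1}\}$, and uses the quantile function (1.11); the function names are hypothetical.

```python
import math
import numpy as np

def sigma_largest(n):
    """Closed form for the standard deviation of the largest reduced value."""
    return n * math.sqrt(n / ((2 * n - 1) * (3 * n - 2)))

def mean_range(n):
    """Closed form for the mean reduced range."""
    eps = math.factorial(n - 1) ** 2 / math.factorial(2 * n - 2)
    return n ** 2 / ((n - 1) * math.sqrt(2 * n - 1)) * (1.0 - eps)

def by_quadrature(n, steps=200_000):
    """Midpoint-rule values of the same two quantities."""
    phi = (np.arange(steps) + 0.5) / steps
    z = (n * phi ** (n - 1) - 1.0) * math.sqrt(2 * n - 1) / (n - 1)   # (1.11)
    w = n * phi ** (n - 1)                    # weight for the largest of n
    mean_l = (z * w).mean()
    sd_l = math.sqrt(((z - mean_l) ** 2 * w).mean())
    mean_rng = (n * z * (phi ** (n - 1) - (1 - phi) ** (n - 1))).mean()
    return sd_l, mean_rng
```

For $n = 2$ the distribution is uniform on $(-\sqrt{3}, \sqrt{3})$, which gives an independent check: the mean range of two uniform observations is one third of the length of the interval.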


The probability function $\Phi_n(z)$ of the largest value defined by

(2.5) $\Phi_n(z) = \Phi^n(z, n)$

becomes from (1.10) and (2.1)

(2.6) $\Phi_n(z) = [(1 + z\bar{z}_n)/n]^{n/(n-1)}.$

The probability $\Phi_n(\bar{z}_n)$ at the mean largest value converges toward $\tfrac12$. The density
function $\varphi_n(z)$ of the largest value obtained from (2.6) and (1.12)

(2.7) $\varphi_n(z) = \Phi(z, n)\, \bar{z}_n/(n - 1)$


is equal to the initial probability function reduced by a factor. It increases with
$z$ up to the upper bound $z_1$. Hence no mode exists. The median largest value $\tilde{z}_n$ obtained from
(2.6) and (2.1)

(2.8) $\tilde{z}_n = (n\, 2^{1/n - 1} - 1)\, \dfrac{\sqrt{2n - 1}}{n - 1}$

is smaller than the mean largest value, converges toward it for $n$ increasing and
increases asymptotically as $\sqrt{n/2}$. The probability of the largest value becomes
at the characteristic largest value $u_n$ from (2.2) and (2.6)

(2.9) $\Phi_n(u_n) = (1 - 1/n)^n.$





Graph 3. Mean largest reduced values $\bar{z}_n$ and mean reduced ranges $\bar{w}_n/\sigma$ as functions of $n$.

a) Upper bound for $\bar{w}_n$ (4.7)
b) Upper bound for $(x_n - \bar{x}_0)/s$ (3.6)
c) $\bar{w}_n$ for distribution (1.11) which maximizes $\bar{z}_n$ (2.4)
d) Upper bound for $\bar{z}_n$ (2.1)
e) $\bar{z}_n$ for exponential distribution (3.2)
f) $\bar{z}_n$ for double exponential distribution (3.4)
g) Upper bound $\bar{z}_n$ for symmetrical distribution (4.7)


This probability converges toward 1/e which is (cf. [3]), the value of the first 
asymptotic probability of extremes, at the characteristic largest value. This is 
an analogy between the probability function of the largest value (2.6) and the 
asymptotic probability functions of largest values obtained by Fréchet [1], 
Fisher and Tippett [2]. If we introduce a reduced largest value 


(2.10) $y = (1 + z\bar{z}_n)/n = [1 + (x - \bar{x})(\bar{x}_n - \bar{x})/\sigma^2]/n,$

the probability (2.6) of the largest value converges to the simple expression

(2.11) $\Phi(y) = y; \qquad 0 \le y \le 1.$

The rectangular distribution is thus the asymptotic distribution of the extreme 
value of the variate z, given by (2.6). This case cannot be derived from the sta- 
bility postulate used by Fréchet, Fisher and Tippett to construct the three 
asymptotic distributions of extreme values. Inversely, the asymptotic probability 
function (2.11) is not stable in the sense used by these authors. 


3. Comparisons. It is interesting to compare the maxima (2.1) of the mean 
largest value to the mean largest values obtained from actual asymmetrical 
distributions. To this end, consider first the exponential distribution 


(3.1) $f(x) = 1 - F(x) = e^{-x}.$





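The generating function (1.1) of the largest value is

$G_n(t) = \int_0^1 n (1 - F)^{-t} F^{n-1}\, dF.$

The right side becomes

$\Gamma(n + 1)\Gamma(1 - t)/\Gamma(n + 1 - t) = 1 \Big/ \prod_{\nu=1}^{n} (1 - t/\nu).$

The usual procedure leads to the mean largest value $\bar{x}_n = \sum_{\nu=1}^{n} 1/\nu$. Since the
initial mean and standard deviation of (3.1) are both unity, the mean
largest reduced value

(3.2) $\bar{z}_n = \sum_{\nu=2}^{n} 1/\nu$

converges for large $n$ to

$\bar{z}_n \sim \gamma + \lg n - 1,$

where $\gamma$ is Euler's constant. Some numerical values of (3.2) taken from Karl
Pearson's table [5] are traced in Graph 3.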


As a further example, take the first asymptotic distribution of the largest
value [3],

(3.3) $F(x) = \exp[-e^{-x}],$

which is closely related to the exponential function. The initial standard deviation
$\sigma$ is $\pi/\sqrt{6}$ and the mean largest value is

$\bar{x}_n = \gamma + \lg n.$

The standardized mean largest value

(3.4) $\bar{z}_n = (\lg n)\sqrt{6}/\pi$

is also traced in Graph 3. For small samples, the mean largest values taken from
the exponential and double exponential distributions (3.1) and (3.3) are only
slightly short of the maxima for any asymmetrical distribution.
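The comparison can be tabulated directly. The sketch below evaluates the bound $(n-1)/\sqrt{2n-1}$ of (2.1) together with the reduced mean largest values $\sum_{\nu=2}^{n} 1/\nu$ for the exponential distribution and $(\lg n)\sqrt{6}/\pi$ for the double exponential distribution, reading lg as the natural logarithm; the function names are hypothetical.

```python
import math

def upper_bound(n):
    """Maximum of the mean largest reduced value, (n - 1) / sqrt(2n - 1)."""
    return (n - 1) / math.sqrt(2 * n - 1)

def exponential(n):
    """Mean largest reduced value for the exponential distribution."""
    return sum(1.0 / v for v in range(2, n + 1))

def double_exponential(n):
    """Mean largest reduced value for the double exponential distribution."""
    return math.log(n) * math.sqrt(6.0) / math.pi

for n in (2, 5, 10, 50):
    print(n, round(upper_bound(n), 4), round(exponential(n), 4),
          round(double_exponential(n), 4))
```

Both distributions stay below the bound for every $n$, and the gap widens as $n$ grows because the bound increases as $\sqrt{n/2}$ while the two means increase only logarithmically.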

The factors $\bar{x}$ and $\sigma$ in the Theorem (1.6) and the mean largest value $\bar{x}_n$ are
population values. We now establish a similar theorem$^1$ which deals with sample
values. Let $x_n$ stand for the observed largest value, $\bar{x}_0$ for the sample mean and
$s$ for the sample standard deviation calculated with the usual correction. Then
$(x_n - \bar{x}_0)^2 \le \sum_\nu (x_\nu - \bar{x}_0)^2$ where the equality holds if, and only if, all observations
$x_\nu$ are the same. It follows immediately that

(3.5) $\dfrac{(x_n - \bar{x}_0)^2}{n - 1} \le s^2$

or

(3.6) $x_n \le \bar{x}_0 + s\sqrt{n - 1}.$

Formula (3.5) may be used to control the calculation of the sample standard
deviation.
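As a control of the kind mentioned, the inequality $x_n \le \bar{x}_0 + s\sqrt{n-1}$ can be verified on any sample. The following sketch uses the Python standard library, where `statistics.stdev` applies the usual correction (divisor $n - 1$); the function name `largest_value_bound` and the simulated exponential samples are illustrative assumptions.

```python
import random
import statistics

def largest_value_bound(sample):
    """Right-hand side of the sample bound: mean + s * sqrt(n - 1)."""
    n = len(sample)
    return statistics.fmean(sample) + statistics.stdev(sample) * (n - 1) ** 0.5

random.seed(0)
samples = [[random.expovariate(1.0) for _ in range(n)] for n in (2, 5, 20, 100)]
# a small tolerance guards against rounding in the floating-point sums
checks = [max(s) <= largest_value_bound(s) + 1e-12 for s in samples]
```

A computed standard deviation that violates this inequality signals an arithmetical error in $s$ or in the sample mean.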


$^1$ This simplification of a different proof was suggested by the referee of this paper. The
author is also obliged to Mr. W. Hoeffding for pointing out a slight inaccuracy in the orig- 
inal draft. 







4. The maximum of the mean range. Plackett [6] has calculated the maximum 
of the mean range as function of the sample size and constructed the initial 
distribution where the maximum is actually reached. In the following it will 
be shown that this maximum holds for any continuous variate possessing the 
first two moments. 

The procedure of Section 1 is now used for the mean range $\bar{w}_n$ given in (1.1).
To find the unknown probability function $F$ which maximizes $\bar{w}_n$ for constant
values of the initial mean and standard deviation the first variation of

$\int_0^1 [n x \{F^{n-1} - (1 - F)^{n-1}\} - k_1 x^2 - k_2 x]\, dF$

with respect to $x$ is to be put equal to zero. The operation leads to
(4.1) $n\{F^{n-1} - (1 - F)^{n-1}\} - 2k_1 x - k_2 = 0$

whence

(4.2) $x = \dfrac{n\{F^{n-1} - (1 - F)^{n-1}\} - k_2}{2k_1}.$

The mean is from (1.1)

(4.3) $\bar{x} = -k_2/2k_1.$

The variance $\sigma^2$ is obtained from (1.1) as

(4.4) $\sigma^2 = \dfrac{2n^2(1 - \varepsilon_n)}{4k_1^2(2n - 1)}$

where the factor

(4.5) $\varepsilon_n = (n - 1)!^2/(2n - 2)!$

vanishes as $2^{-2n}$. The standardized variate $z$ defined in (1.8) becomes from
(4.2) and (4.3)

(4.6) $z = \sqrt{\dfrac{2n - 1}{2(1 - \varepsilon_n)}}\, \{F^{n-1} - (1 - F)^{n-1}\}.$

It follows from (1.1) that the mean range is

(4.7) $\bar{w}_n = n\sqrt{\dfrac{2(1 - \varepsilon_n)}{2n - 1}}$

which is Plackett’s formula for the maximum of the mean range. From (4.6) 
it follows that the variate z for which this maximum is reached has a symmetrical 
distribution. Therefore, the mean of the largest value which is one-half of the 
value given by (4.7) is the maximum which this mean can reach for any sym- 
metrical distribution. This result was derived by Moriguti [4] from a different 
approach. 
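The formula for the maximum of the mean range can be checked by quadrature in the same way as in Section 1. The sketch below assumes the readings $\varepsilon_n = (n-1)!^2/(2n-2)!$, $z = \sqrt{(2n-1)/\{2(1-\varepsilon_n)\}}\,\{F^{n-1} - (1-F)^{n-1}\}$ and $\bar{w}_n = n\sqrt{2(1-\varepsilon_n)/(2n-1)}$; the function names are hypothetical and $n \ge 2$ is required.

```python
import math
import numpy as np

def eps(n):
    """The factor (n - 1)!^2 / (2n - 2)!."""
    return math.factorial(n - 1) ** 2 / math.factorial(2 * n - 2)

def plackett_bound(n):
    """Maximum of the mean range in standard units."""
    return n * math.sqrt(2.0 * (1.0 - eps(n)) / (2 * n - 1))

def check(n, steps=200_000):
    """Midpoint-rule mean, mean square, mean range and mean largest value."""
    F = (np.arange(steps) + 0.5) / steps
    kernel = F ** (n - 1) - (1 - F) ** (n - 1)
    z = math.sqrt((2 * n - 1) / (2.0 * (1.0 - eps(n)))) * kernel
    return (z.mean(), (z ** 2).mean(),
            (n * z * kernel).mean(), (n * z * F ** (n - 1)).mean())
```

The last value returned by `check(n)` is the mean largest value, which for this symmetrical distribution should equal one-half of `plackett_bound(n)`.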

It remains to calculate the standard deviation of the largest value for the
distribution (4.6). From (1.1) it follows that the mean square of the largest
value is

$\overline{z_n^2} = \dfrac{n(2n - 1)(1 - \delta_n)}{2(3n - 2)(1 - \varepsilon_n)}$

where

(4.8) $\delta_n = (2n - 2)!\,(n - 1)!/(3n - 3)!$

converges to zero as $2^{2n} 3^{-3n}$. Since from (4.7) $\bar{z}_n^2 = n^2(1 - \varepsilon_n)/2(2n - 1)$, the
standard deviation of the largest value for $z$ approaches

(4.9) $\sigma_n \approx (n - 1)\sqrt{n/[2(2n - 1)(3n - 2)]}$

and for large values of $n$

(4.9′) $\sigma_n \sim \tfrac12\sqrt{n/3}.$

The coefficient of variation $\sigma_n/\bar{z}_n$ of the largest value approaches

(4.10) $\sigma_n/\bar{z}_n \sim 1/\sqrt{3}.$

Of course, the quotient $\sigma_n/\bar{w}_n$ approaches one-half this amount.


5. Conclusions. It is interesting to compare the asymptotic properties of the
reduced values for the two distributions (1.10) and (4.6) which lead to the
maximum of the mean largest value and to the maximum of the mean range
respectively. The maximum of the mean largest value for an asymmetrical
distribution increases as $\sqrt{n/2}$, while for symmetrical distributions it increases
as $\sqrt{n}/2$. The maximum of the mean largest value for an asymmetrical distribution
is 41 per cent larger than for a symmetrical one. Inversely, the asymmetrical
distribution for which the mean largest value is a maximum yields a mean range
which is only 71 per cent of the maximum of the mean range. However, the
standard deviation of the largest value for the distribution with maximum mean
range is only 71 per cent of the corresponding value for the distribution with
maximum mean largest value. The coefficient of variation of the extreme value
for the two distributions is the same. The quotients $\bar{w}_n/\sigma_n$ and $\bar{w}_n/\bar{z}_n$ for the distribution
with maximum mean largest value are one-half of the values for the
other distribution.

REFERENCES 

[1] M. Fréchet, "Sur la loi de probabilité de l'écart maximum," Ann. Soc. Polon. Math.,
Vol. 6 (1927), pp. 93-122.

[2] R. A. Fisher and L. H. C. Tippett, "Limiting forms of the frequency distribution of
the smallest and the largest member of a sample," Proc. Cambridge Philos.
Soc., Vol. 24 (1928), pp. 180-190.

[3] E. J. Gumbel, "The return period of flood flows," Ann. Math. Stat., Vol. 12 (1941),
pp. 163-190.

[4] Sigeiti Moriguti, "Extremal properties of extreme value distributions," Ann. Math.
Stat., Vol. 22 (1951), pp. 523-536.

[5] K. Pearson and M. V. Pearson, "On the mean character and variance of a ranked
individual, and on the mean and variance of the intervals between ranked
individuals, II," Biometrika, Vol. 24 (1932), pp. 203-279.

[6] R. L. Plackett, "Limits of the ratio of mean range to standard deviation," Biometrika,
Vol. 34 (1947), pp. 120-122.





UNIVERSAL BOUNDS FOR MEAN RANGE AND 
EXTREME OBSERVATION 


By H. O. Hartley and H. A. David


Iowa State College, C. S. I. R. O., Sydney, Australia and University 
College, London 


1. Summary. Consider any distribution $f(x)$ with standard deviation $\sigma$ and
let $x_1, x_2, \cdots, x_n$ denote the order statistics in a sample of size $n$ from $f(x)$. Further
let $w_n = x_n - x_1$ denote the sample range. Universal upper and lower bounds
are derived for the ratio $E(w_n)/\sigma$ for any $f(x)$ for which $a\sigma \le x \le b\sigma$, where $a$
and $b$ are given constants. Universal upper bounds are given for $E(x_n)/\sigma$ for
the case $-\infty < x < \infty$. The upper bounds are obtained by adopting procedures
of the calculus of variation on lines similar to those used by Plackett [3] and
Moriguti [4]. The lower bounds are attained by singular distributions and require
the use of special arguments.


2. Introduction. The use of range or mean range in the estimation of $\sigma$ has
received considerable attention in industrial quality control as well as more
recently in techniques of short-cut analysis of variance. Like many alternative
methods of estimation the procedure assumes that the basic distribution of the
data is normal. In this case the relation $E(w_n) = d_n\sigma$ holds, where $d_n$ is a well
known constant, first tabulated by Tippett [1]. Thus an observed range $w_n$ can
be converted into an unbiased estimator of $\sigma$ by $\hat{\sigma} = w_n/d_n$.

It is of interest to consider to what extent this estimator is biased if the basic
distribution is not normal. Now the general formula for $E(w_n)$ in any population
with cumulative distribution function$^1$ $P(x)$ is given (see [1]) by

\[ E(w_n) = n \int_0^1 x(P)\,[P^{n-1} - (1-P)^{n-1}]\,dP. \tag{1} \]


E. S. Pearson (see e.g. [2]) has studied empirically the effect of nonnormality in
$P(x)$ on $E(w_n)/\sigma$ for a variety of Pearson Type distributions, and found this
ratio to be very stable. Taken in conjunction with (1) this suggests the
possibility of establishing universal upper and lower bounds for $E(w_n)/\sigma$ on lines
similar to the well known Tchebycheff inequalities for moments. This problem
has already been considered by Plackett [3] and, in greater detail, by Moriguti
[4] for the equivalent case of the extreme value in a symmetrical population.
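Formula (1) can be evaluated numerically for any inverse distribution function $x(P)$. The following is a minimal sketch, not the authors' computation: it uses a plain midpoint rule, and the helper name `expected_range` and the step count are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist

def expected_range(inv_cdf, n, steps=100_000):
    """Midpoint-rule evaluation of (1): n * integral of x(P) [P^(n-1) - (1-P)^(n-1)] dP."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        p = (k + 0.5) * h  # midpoints never touch P = 0 or P = 1
        total += inv_cdf(p) * (p ** (n - 1) - (1 - p) ** (n - 1))
    return n * total * h

def uniform_inv_cdf(p):
    # Uniform distribution on (-sqrt(3), sqrt(3)): mean 0, standard deviation 1.
    return sqrt(3) * (2 * p - 1)

def normal_d(n):
    # The constant d_n of the text, for a unit normal parent.
    return expected_range(NormalDist().inv_cdf, n)
```

With a unit normal parent, `normal_d(2)` and `normal_d(5)` come out close to the familiar quality-control constants 1.128 and 2.326, while the unit-variance uniform parent gives $2/\sqrt{3} \approx 1.155$ for $n = 2$.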


Received 1/27/53, revised 9/30/53. 
1 Although this is unnecessarily restrictive it is convenient to assume that $P(x)$ is
piecewise differentiable.




Upper bounds have been tabulated by these two authors and are shown to be
attained for the c.d.f. $P(x)$ of which the inverse function $x(P)$ is given by

\[ x(P) = \frac{(2n-1)^{1/2}}{\left[2\left\{1 - 1\Big/\binom{2n-2}{n-1}\right\}\right]^{1/2}}\,[P^{n-1} - (1-P)^{n-1}]. \tag{2} \]

However, there is no lower bound (other than zero) since $E(w_n)/\sigma$ can be made
arbitrarily small for certain universes.

From the point of view of practical applications this last defect is particularly
disappointing as we often have to deal with data which are clearly not normal
whilst no definite alternate distribution is known. In such situations one is,
however, often able to assume that the data may be graduated by a finite range
distribution (not extending beyond the range $a\sigma \le x \le b\sigma$) without making any
further assumptions as to the shape of the distribution.

It is remarkable that if the above very wide assumption is made about $P(x)$
it is, in fact, possible to derive lower bounds for $E(w_n)/\sigma$ which are well above
zero and in certain cases fairly near the upper bounds.

To fix the ideas we assume that $E(x) = 0$ and $\sigma = 1$ and then proceed to
consider distributions for which it is known that $-X \le x \le X$. If the range of
$x$ is $a \le x \le b$, that is, is asymmetrically placed about the mean or origin, $X$ has
to be taken as $\max(-a, b)$. In this case the upper and lower bounds are not
necessarily attained.

Before discussing the problem arising when $x$ is restricted we shall derive the
maximum of $E(x_n)$ in the unrestrained case, abandoning the condition of
symmetry in the parental population imposed by Moriguti. An upper bound for
$E(x_m)$ $(m = 1, 2, \cdots, n)$ will also be given.


3. Upper bound of the expectation of $x_m$. We consider first the extreme $x_n$:

\[ E(x_n) = n \int_0^1 x(P)\,P^{n-1}\,dP. \tag{3} \]

This is to be maximized for functions $x(P)$ subject to

\[ \int_0^1 x\,dP = 0 \quad\text{and}\quad \int_0^1 x^2\,dP = 1. \tag{4} \]

From the calculus of variations* we find the stationary solution

\[ P(x) = \left(\frac{ax+1}{n}\right)^{1/(n-1)}, \qquad -\frac{(2n-1)^{1/2}}{n-1} \le x \le (2n-1)^{1/2}, \tag{5} \]

where $a = (n-1)/(2n-1)^{1/2}$, and

\[ E(x_n) = \frac{n(2n-1)^{1/2}}{n-1} \int_0^1 P^{n-1}(nP^{n-1} - 1)\,dP = (n-1)/(2n-1)^{1/2}. \tag{6} \]

It will now be shown that (6) gives in fact the upper bound of $E(x_n)$ and that
this is attained for the distribution of (5).


* The details of this derivation are omitted since they are given in the preceding paper. 





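By Schwarz's inequality we have

\[ \int_0^1 nP^{n-1}\left(x + \frac{(2n-1)^{1/2}}{n-1}\right) dP \le \left\{\int_0^1 (nP^{n-1})^2\,dP \int_0^1 \left(x + \frac{(2n-1)^{1/2}}{n-1}\right)^2 dP\right\}^{1/2}. \]

Hence

\[ E(x_n) + \frac{(2n-1)^{1/2}}{n-1} \le \frac{n}{(2n-1)^{1/2}}\left\{1 + \frac{2n-1}{(n-1)^2}\right\}^{1/2} = \frac{n^2}{(n-1)(2n-1)^{1/2}}, \]

that is,

\[ E(x_n) \le (n-1)/(2n-1)^{1/2}, \tag{6'} \]

equality occurring for (5). This upper bound of $E(x_n)$ is tabulated in Table 1
and may be compared with Moriguti's upper bound applicable to symmetrical
populations only.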


Precisely as above it can be shown that

\[ |E(x_m)| \le \left\{\frac{B(2m-1,\ 2n-2m+1)}{[B(m,\ n-m+1)]^2} - 1\right\}^{1/2}. \]


TABLE 1

 Sample     Upper bound of E(x_n)        Upper bound of E(x_n)
 size n     (symmetrical population)     (any population)

    2              0.5774                      0.5774
    3              0.8660                      0.8944
    4              1.0420                      1.1339
    5              1.1701                      1.3333
    6              1.2767                      1.5076
    7              1.3721                      1.6641
    8              1.4604                      1.8074
    9              1.5434                      1.9403
   10              1.6222                      2.0647
   11              1.6974                      2.1822
   12              1.7693                      2.2937
   13              1.8385                      2.4000
   14              1.9052                      2.5019
   15              1.9696                      2.5997
   16              2.0320                      2.6941
   17              2.0926                      2.7852
   18              2.1514                      2.8735
   19              2.2087                      2.9592
   20              2.2645                      3.0424
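Both columns of Table 1 follow from closed forms, so they are easy to recompute. In this sketch `bound_any` is the bound (6'); `bound_symmetric` is taken as one-half of the maximum mean range attained by (2), which is an assumption about how the symmetrical column was produced. The function names are hypothetical.

```python
from math import comb, sqrt

def bound_any(n):
    # (6'): upper bound of E(x_n) for any population with mean 0 and sigma 1.
    return (n - 1) / sqrt(2 * n - 1)

def bound_symmetric(n):
    # Half the maximum mean range for distribution (2) (assumed form):
    # n * sqrt((1 - eps) / (2 (2n - 1))), with eps = 1 / C(2n-2, n-1).
    eps = 1.0 / comb(2 * n - 2, n - 1)
    return n * sqrt((1 - eps) / (2 * (2 * n - 1)))

def table_1(n_max=20):
    # Rows (n, symmetrical bound, general bound) rounded as in the printed table.
    return [(n, round(bound_symmetric(n), 4), round(bound_any(n), 4))
            for n in range(2, n_max + 1)]
```

The two bounds coincide at $n = 2$ and separate steadily thereafter, the general bound approaching $\sqrt{2}$ times the symmetrical one.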





However, this upper bound can be attained by a probability distribution only
if $m = n$ (or $m = 1$), as the stationary solution is

\[ x = \frac{1}{B(m,\ n-m+1)}\,P^{m-1}(1-P)^{n-m} - 1 \]

and this expression for $x$ is monotonic only if $m = n$ or $m = 1$.
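As a quick check on the bound for $|E(x_m)|$, the sketch below evaluates it with log-gamma functions; the helper names are hypothetical. For $m = n$ it should reduce to (6'), and it should be unchanged when $m$ is replaced by $n - m + 1$.

```python
from math import exp, lgamma, sqrt

def beta(a, b):
    # Complete beta function B(a, b) via log-gamma for numerical stability.
    return exp(lgamma(a) + lgamma(b) - lgamma(a + b))

def bound_order_statistic(m, n):
    """Upper bound of |E(x_m)| for the m-th order statistic in samples of n."""
    ratio = beta(2 * m - 1, 2 * n - 2 * m + 1) / beta(m, n - m + 1) ** 2
    return sqrt(ratio - 1.0)
```

For example, `bound_order_statistic(5, 5)` agrees with $(n-1)/(2n-1)^{1/2} = 4/3$, while the bound for the median of five is about 0.655; as the text notes, only the extreme cases are actually attained.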


4. Upper bound of $E(w_n)$ for $-X \le x \le X$. When introducing the finite
variate range $-X \le x \le X$ we stated that this restriction raised the lower
bound of $E(w_n)$. The restraint may, however, also cause a reduction of the "free"
upper bound of $W_n = E(w_n)$ found by Plackett and Moriguti. Since Moriguti
confines himself to finding the upper bound for symmetrical distributions we
first show that his solution applies generally and provides the maximum ratio
of mean range to standard deviation in the competitor class of unrestrained $x(P)$.
To show this let us start with a general $x(P)$ and write $y(p) = x(p + \tfrac12)$ which
we split into a symmetrical and anti-symmetrical part by setting $y(p) = o(p) + e(p)$
where $o(-p) = -o(p)$ and $e(-p) = e(p)$. Now $(p + \tfrac12)^{n-1} - (\tfrac12 - p)^{n-1}$
is clearly odd in $p$. Hence $W_n$ is unchanged if $y(p)$ is replaced by $o(p)$, but

\[ \sigma^2\{o(p)\} = \int_{-1/2}^{+1/2} o^2(p)\,dp = \int_{-1/2}^{+1/2} y^2(p)\,dp - \int_{-1/2}^{+1/2} e^2(p)\,dp \le \int_{-1/2}^{+1/2} y^2(p)\,dp = 1. \]

Hence $W_n/\sigma$ is increased by the removal of $e(p)$. Finally "scaling up" $o(p)$, that
is, introducing $c\,o(p)$ so as to satisfy $\sigma^2\{c\,o(p)\} = 1$ we have

\[ W_n\{c\,o(p)\} = c\,W_n\{o(p)\} \ge W_n\{y(p)\} \]

which shows that, in finding the maximum of $W_n$ given $\sigma = 1$ we may confine
ourselves to symmetrical distributions, that is, odd $y(p) = x(\tfrac12 + p)$. This
maximum is attained for the finite range distribution (2) for which we have


\[ |x| \le \left\{\frac{2n-1}{2\left[1 - 1\Big/\binom{2n-2}{n-1}\right]}\right\}^{1/2} = X_n, \tag{7} \]

say. It follows that if we seek a maximum under the restraint that $|x(P)| \le X$,
the solution is still given by (2) provided $X \ge X_n$ while for $X < X_n$ the restrained
maximum will be reduced. The critical quantity $X_n$ is tabulated below:

   n      2      3      4      5      6      7      8      9     10
  X_n   1.732  1.732  1.919  2.137  2.350  2.551  2.739  2.916  3.082

   n     11     12     13     14     15     16     17     18     19     20
  X_n   3.240  3.391  3.535  3.674  3.808  3.937  4.062  4.183  4.301  4.416
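The critical half-range $X_n$ is the value of (2) at $P = 1$, so the short table can be regenerated directly from (7). A minimal sketch, with a hypothetical function name:

```python
from math import comb, sqrt

def critical_half_range(n):
    """X_n of (7): the largest |x| taken by the unrestricted maximizing distribution (2)."""
    eps = 1.0 / comb(2 * n - 2, n - 1)
    return sqrt((2 * n - 1) / (2 * (1 - eps)))

# X_n for n = 2, ..., 20, rounded as in the printed table.
critical_table = {n: round(critical_half_range(n), 3) for n in range(2, 21)}
```

For $n = 2$ and $n = 3$ the value is exactly $\sqrt{3}$, and for larger $n$ it approaches $\sqrt{n - \tfrac12}$ because the binomial term quickly becomes negligible.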





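We proceed to find the maximum when $|x(P)| \le X < X_n$. The solution $x_0(P)$
(say) suggested by the calculus of variation is now

\[ \begin{aligned} a_n x_0(P) &= n(P^{n-1} - Q^{n-1}) && \text{for } P_1 \le P \le 1 - P_1, \\ x_0(P) &= -X && \text{for } P \le P_1, \\ x_0(P) &= X && \text{for } 1 - P_1 \le P, \end{aligned} \tag{8} \]

where $P_1$ is related to the constant $a_n > 0$ by

\[ a_n X = n(-P_1^{n-1} + (1 - P_1)^{n-1}). \tag{9} \]

$a_n$ is determined to satisfy $\sigma^2(x) = 1$, and $Q = 1 - P$. To prove that this
solution $x_0(P)$ does in fact yield a maximum we denote by $x_1(P)$ any other competitor
function satisfying the conditions $\sigma^2(x_1) = 1$ and $|x_1| \le X$ and write with
obvious notation:

\[ \begin{aligned} \frac{1}{n}[W_n(x_1) - W_n(x_0)] &= \int_0^1 (x_1 - x_0)(P^{n-1} - Q^{n-1})\,dP - \frac{a_n}{2n}\int_0^1 (x_1^2 - x_0^2)\,dP \\ &= \int_0^1 (x_1 - x_0)\left(P^{n-1} - Q^{n-1} - \frac{a_n}{n}x_0\right) dP - \frac{a_n}{2n}\int_0^1 (x_1 - x_0)^2\,dP \\ &= \int_0^{P_1} (x_1 + X)\left(P^{n-1} - Q^{n-1} + \frac{a_n X}{n}\right) dP \\ &\quad + \int_{1-P_1}^1 (x_1 - X)\left(P^{n-1} - Q^{n-1} - \frac{a_n X}{n}\right) dP - \frac{a_n}{2n}\int_0^1 (x_1 - x_0)^2\,dP \end{aligned} \]

in virtue of (8).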
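But clearly

\[ P^{n-1} - Q^{n-1} + a_n X/n \le 0 \ \text{ and } \ x_1 \ge -X \quad \text{for } P \le P_1 \]

and

\[ P^{n-1} - Q^{n-1} - a_n X/n \ge 0 \ \text{ and } \ x_1 \le X \quad \text{for } P \ge 1 - P_1 \]

so that $W_n(x_1) \le W_n(x_0)$. To evaluate the maximum of $W_n$ when $X < X_n$ we note
from (1), (8) and (9)

\[ \begin{aligned} W_n &= n\int_{P_1}^{1-P_1} \frac{n}{a_n}\,(P^{n-1} - Q^{n-1})^2\,dP + 2X(1 - P_1^n - (1 - P_1)^n) \\ &= \frac{2nX}{(2n-1)\{(1-P_1)^{n-1} - P_1^{n-1}\}}\left\{(1-P_1)^{2n-1} - P_1^{2n-1} - (2I_{1-P_1}(n,n) - 1)\Big/\binom{2n-2}{n-1}\right\} \\ &\quad + 2X(1 - (1 - P_1)^n - P_1^n), \end{aligned} \tag{10} \]

where $I_{1-P_1}(n,n)$ is the incomplete beta-function ratio.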


Here $P_1$ is given by the condition

\[ \int_{1/2}^{1} x_0^2(P)\,dP = \tfrac12, \tag{11} \]

that is, from the equation

\[ \frac{X^2}{(2n-1)\{(1-P_1)^{n-1} - P_1^{n-1}\}^2}\left\{(1-P_1)^{2n-1} - P_1^{2n-1} - (2I_{1-P_1}(n,n) - 1)\Big/\binom{2n-2}{n-1}\right\} + P_1X^2 = \tfrac12. \tag{12} \]

From (10) we therefore obtain as the upper bound

\[ W_n = n(1 - 2P_1X^2)\{(1 - P_1)^{n-1} - P_1^{n-1}\}/X + 2X(1 - (1 - P_1)^n - P_1^n) \tag{13} \]

where $P_1$ is the root of (12).
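The restricted upper bound can be sketched numerically without the incomplete beta function by integrating $(P^{n-1} - Q^{n-1})^2$ directly. The code below is an illustration under stated assumptions, not the authors' procedure: it assumes $1 < X < X_n$, solves the variance condition (11) for $P_1$ by bisection on $(0, \tfrac12)$, and then applies (13). All function names are hypothetical.

```python
def tail_gap(n, p1):
    # The bracket (1 - P1)^(n-1) - P1^(n-1) appearing in (9).
    return (1 - p1) ** (n - 1) - p1 ** (n - 1)

def middle_integral(n, p1, steps=4000):
    # Midpoint rule for the integral of (P^(n-1) - Q^(n-1))^2 over [P1, 1 - P1].
    h = (1 - 2 * p1) / steps
    total = 0.0
    for k in range(steps):
        p = p1 + (k + 0.5) * h
        total += (p ** (n - 1) - (1 - p) ** (n - 1)) ** 2
    return total * h

def variance(n, X, p1):
    # Second moment of x_0(P) in (8), using n / a_n = X / tail_gap from (9).
    return 2 * p1 * X * X + (X / tail_gap(n, p1)) ** 2 * middle_integral(n, p1)

def solve_p1(n, X, iters=50):
    # Bisection: the variance is below 1 at P1 = 0 and near X^2 > 1 as P1 -> 1/2.
    lo, hi = 0.0, 0.5 - 1e-6
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if variance(n, X, mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def restricted_upper_bound(n, X):
    # Formula (13) evaluated at the root P1 of the variance condition.
    p1 = solve_p1(n, X)
    tails = 2 * X * (1 - (1 - p1) ** n - p1 ** n)
    return n * (1 - 2 * p1 * X * X) * tail_gap(n, p1) / X + tails
```

For $n = 4$ and $X = 1.5$ (below $X_4 = 1.919$) the result lies between the two-point value $2(1 - 2^{-3}) = 1.75$ and the unrestricted maximum of about 2.084, as it must.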


5. Lower bound of $E(w_n)$ for $-X \le x \le X$. Reverting to the probability
integral $P(x)$ in place of its inverse $x(P)$ we turn now to the problem of
minimizing

\[ W\{P\} = \int_{-X}^{X} (1 - P^n - Q^n)\,dx \tag{14} \]

subject to

\[ \mu = X - \int_{-X}^{X} P(x)\,dx = 0, \tag{15} \]

\[ \sigma^2 = X^2 - 2\int_{-X}^{X} x\,P(x)\,dx = 1 \tag{16} \]

and

\[ -X \le x \le X. \tag{17} \]

Without loss of generality we may confine ourselves to step-functions of
(say) $m$ "internal" steps, namely

\[ \begin{aligned} P(x) &= 0 && x < x_1, \\ P(x) &= P_i && x_i \le x < x_{i+1}, \\ P(x) &= 1 && x \ge x_{m+1}, \end{aligned} \tag{18} \]

where $0 < P_1 < \cdots < P_m < 1$; for by the Euler-Maclaurin theorem we can,
with any accuracy desired, approximate to the integrals (14), (15) and (16) by
step-functions, provided $m$ is taken sufficiently large. Hence the lower bound of
$W$ given by (14) may be determined for the step-functions (18).

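Lemma A. For any $m$-step-function $P(x)$ (see (18)) with $m \ge 3$, satisfying
(15), (16) and (17), a new step-function $P^*(x)$ also satisfying (15), (16) and
(17) can be found for which the number of steps is reduced by at least one and
$W\{P^*\} \le W\{P\}$.

Proof. Keeping the $x_i$ unchanged we define $P^*$ as follows:

\[ \begin{aligned} P_i^* &= P_i + \Delta_i^* && (i = 1, 2, 3), \\ P_i^* &= P_i && (i = 4, \cdots, m), \end{aligned} \tag{19} \]

where the $\Delta_i^*$ will be determined in the form $\Delta_i^* = \rho\Delta_i$ with the common
scale-factor, $\rho$, being subsequently chosen.
In order that $P^*$ should satisfy (15) and (16) we have

\[ \sum_{i=1}^{3} \Delta_i (x_{i+1} - x_i) = 0, \qquad \sum_{i=1}^{3} \Delta_i (x_{i+1}^2 - x_i^2) = 0. \tag{20} \]

Condition (17) is automatically satisfied.
From (18) we have

\[ W\{P\} = \sum_{i=1}^{m} (1 - P_i^n - Q_i^n)(x_{i+1} - x_i) \tag{21} \]

and hence, using Taylor's expansion up to the second-order term, that

\[ \begin{aligned} W\{P^*\} - W\{P\} = &-n \sum_{i=1}^{3} (P_i^{n-1} - Q_i^{n-1})(x_{i+1} - x_i)\,\Delta_i^* \\ &- \tfrac12 n(n-1) \sum_{i=1}^{3} \{(P_i + \theta_i\Delta_i^*)^{n-2} + (Q_i - \theta_i\Delta_i^*)^{n-2}\}(x_{i+1} - x_i)\,\Delta_i^{*2} \end{aligned} \tag{22} \]

where $0 < \theta_i < 1$. Consider now the $3 \times 3$ matrix

\[ M = \begin{pmatrix} x_{i+1} - x_i \\ x_{i+1}^2 - x_i^2 \\ (P_i^{n-1} - Q_i^{n-1})(x_{i+1} - x_i) \end{pmatrix} \quad \text{with columns } i = 1, 2, 3. \]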

Let $r$ denote the rank of $M$. We distinguish two cases: (a) $r < 3$. In this case
we can satisfy the equations (20) as well as

\[ n \sum_{i=1}^{3} (P_i^{n-1} - Q_i^{n-1})(x_{i+1} - x_i)\,\Delta_i = 0 \tag{23} \]

by a set of $\Delta_i$ with $\sum \Delta_i^2 > 0$. Clearly the set $\rho\Delta_i$ also satisfies (20) and (23). For
sufficiently small $\rho$ we obviously have

\[ 0 < P_1 + \rho\Delta_1 < \cdots < P_3 + \rho\Delta_3 < P_4 < \cdots < P_m < 1. \tag{24} \]

Now if we increase $\rho$ continuously, a point is reached, when one (or more) of the
inequality signs of (24) is changed into an equality sign. Taking this value for
$\rho$ we have correspondingly a $P^*$ of at most $(m - 1)$ steps. Further, from (22)
and (23)

\[ W\{P^*\} \le W\{P\}. \tag{25} \]

(b) $r = 3$. This allows us to satisfy (20) and

\[ n \sum_{i=1}^{3} (P_i^{n-1} - Q_i^{n-1})(x_{i+1} - x_i)\,\Delta_i = 1 \]

with a set of $\Delta_i$, not all zero, so that the set $\Delta_i^* = \rho\Delta_i$ $(\rho > 0)$ satisfies (20) and

\[ n \sum_{i=1}^{3} (P_i^{n-1} - Q_i^{n-1})(x_{i+1} - x_i)\,\Delta_i^* = \rho. \tag{26} \]

We again choose $\rho$ as in (a) to obtain $P^*$ and in virtue of (22) and (26) reach the
required result (25).

So far we have shown that the minimum of $W\{P\}$ can be determined by
restricting $P(x)$ to step-functions of at most $m = 2$ internal steps. Writing

\[ p_1 = P_1, \quad p_2 = P_2 - P_1, \quad p_3 = 1 - P_2 \tag{27} \]

we now prove a further lemma.
Lemma B. $W$ cannot attain its minimum for a set of $x_i$, $p_i$ $(i = 1, 2, 3)$ satisfying

\[ -X < x_1 < x_2 < x_3 < X \tag{28} \]

and

\[ p_i > 0. \tag{29} \]

Proof. In terms of $p_i$ conditions (15) and (16) become

\[ \mu = \sum_{i=1}^{3} p_i x_i = 0, \tag{30} \]

\[ \sigma^2 = \sum_{i=1}^{3} p_i x_i^2 = 1. \tag{31} \]

Keeping $p_i$ constant we alter $x_i$ to $x_i^*$ by

\[ x_i^* = x_i + \rho\,\Delta x_i. \tag{32} \]

The $\Delta x_i$ will be determined to satisfy, in the first instance, the conditions

\[ \mu\{P^*\} = 0, \quad \sigma^2\{P^*\} > 1 \quad\text{and}\quad W\{P^*\} = W\{P\} \tag{33} \]

namely,

\[ \begin{aligned} &\sum p_i\,\Delta x_i = 0, \\ &\sum p_i x_i\,\Delta x_i = \delta, \\ &(1 - P_1^n - Q_1^n)(\Delta x_2 - \Delta x_1) + (1 - P_2^n - Q_2^n)(\Delta x_3 - \Delta x_2) = 0, \end{aligned} \tag{33'} \]

where $\delta$ will be specified forthwith. If for the rank $r$ of the matrix of equations
(33') we have $r < 3$ we solve for $\delta = 0$, if $r = 3$ we solve for $\delta = 1$. In either
case $\sum \Delta x_i^2 > 0$. For sufficiently small $\rho$ we have, by (28)

\[ -X < x_1 + \rho\Delta x_1 < x_2 + \rho\Delta x_2 < x_3 + \rho\Delta x_3 < X \]

and as before choose the smallest $\rho > 0$ which converts at least one of the
inequalities into an equality.
Since

\[ \sigma^2\{P^*\} = \sigma^2\{P\} + 2\rho\delta + \rho^2 \sum p_i\,\Delta x_i^2 \]

and all $p_i > 0$ we have $\sigma^2\{P^*\} > \sigma^2\{P\} = 1$ so that conditions (33) are satisfied.
We now need merely introduce a new distribution $P'$ which takes the discrete
frequencies $p_i$ at the $x$-values

\[ x_i' = x_i^*/\sigma\{P^*\} \]

and see at once that

\[ \mu\{P'\} = 0, \quad \sigma\{P'\} = 1, \quad\text{and}\quad W\{P'\} < W\{P\} \]

which proves Lemma B.

Completion of the solution. From Lemma B it follows that the minimum can
occur only for the following sets of $p_i$ and $x_i$:

(a) One of the $p_i$ is zero. This corresponds to a two-point distribution and
includes the cases $x_1 = x_2$ or $x_2 = x_3$.

(b) $p_i > 0$ $(i = 1, 2, 3)$ and either

   (i) $-X < x_1 < x_2 < x_3 = X$, or

   (ii) $-X = x_1 < x_2 < x_3 < X$.

(c) $p_i > 0$ and $-X = x_1 < x_2 < x_3 = X$.

We proceed to rule out (b) and (c). First consider (b) where we may confine
ourselves to (i).

Suppose that the minimum did occur for a set of $x_i$, $p_i$ satisfying (b), then
clearly this set would have to satisfy the necessary conditions resulting from
Lagrange's method of undetermined multipliers for the "free" variables $x_1$,
$x_2$, $p_1$, $p_2$, $p_3$ with side conditions $\sigma^2 = 1$, $\mu = 0$, $\sum p_i = 1$. It will be shown
that a set satisfying these equations can not provide a minimum. We may write
$W\{P\}$ as

\[ W = A_1(x_2 - x_1) + A_3(x_3 - x_2) \tag{34} \]

(where $A_j = 1 - p_j^n - (1 - p_j)^n$, $j = 1, 3$) and the variables are subject to the
side conditions

\[ \sum p_i = 1, \quad \sum p_i x_i = 0, \quad \sum p_i x_i^2 = 1 \tag{35} \]

and

\[ 0 < p_i < 1, \quad -X < x_1 < x_2 < x_3 = X. \tag{36} \]

It follows from partial differentiation with respect to $x_1$, $x_2$, $p_1$, $p_2$, $p_3$ that

\[ \alpha p_1 x_1 + \beta p_1 - A_1 = 0, \tag{37} \]
\[ \alpha p_2 x_2 + \beta p_2 + A_1 - A_3 = 0, \tag{38} \]
\[ \tfrac12\alpha x_1^2 + \beta x_1 + \gamma + (x_2 - x_1)A_1' = 0, \tag{39} \]
\[ \tfrac12\alpha x_2^2 + \beta x_2 + \gamma = 0, \tag{40} \]
\[ \tfrac12\alpha x_3^2 + \beta x_3 + \gamma + (x_3 - x_2)A_3' = 0, \tag{41} \]




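$\alpha$, $\beta$, $\gamma$ being Lagrangean multipliers. From (39) minus (40) we find

\[ \tfrac12\alpha(x_1 + x_2) + \beta - A_1' = 0, \tag{42} \]

from (40) minus (41) we find

\[ \tfrac12\alpha(x_2 + x_3) + \beta + A_3' = 0, \tag{43} \]

from (42) minus (43) we find

\[ \tfrac12\alpha(x_1 - x_3) = A_1' + A_3', \tag{44} \]

from (37) and (38) we find

\[ \alpha(x_1 - x_2)\,p_1p_2 = p_2A_1 - p_1(A_3 - A_1), \tag{45} \]

from (37) and (38) we find

\[ \tfrac12\alpha(x_1 + x_2) + \beta = \tfrac12[A_1/p_1 - (A_1 - A_3)/p_2], \tag{46} \]

from (44) and (45) we find

\[ \frac{x_2 - x_1}{x_3 - x_1} = \frac{p_1(A_1 - A_3) + p_2A_1}{2p_1p_2(A_1' + A_3')} = Q, \tag{47} \]

say, from (42) and (46) we find

\[ (A_1 - A_3)/p_2 = (A_1 - 2p_1A_1')/p_1. \tag{48} \]

Now, if $q_j = 1 - p_j$,

\[ \begin{aligned} A_1' + A_3' &= n(q_1^{n-1} - p_1^{n-1} + q_3^{n-1} - p_3^{n-1}) \\ &= n\{(p_2 + p_3)^{n-1} - p_3^{n-1} + (p_1 + p_2)^{n-1} - p_1^{n-1}\} > 0. \end{aligned} \]

Hence from (44) we have $\alpha < 0$.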


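The matrix of second-order partial differential coefficients corresponding to
equations (37) to (41), with rows and columns in the order $p_1$, $p_2$, $p_3$, $x_1$, $x_2$, is

\[ \begin{pmatrix} (x_2 - x_1)A_1'' & 0 & 0 & \alpha x_1 + \beta - A_1' & A_1' \\ 0 & 0 & 0 & 0 & \alpha x_2 + \beta \\ 0 & 0 & (x_3 - x_2)A_3'' & 0 & -A_3' \\ \alpha x_1 + \beta - A_1' & 0 & 0 & \alpha p_1 & 0 \\ A_1' & \alpha x_2 + \beta & -A_3' & 0 & \alpha p_2 \end{pmatrix} \tag{49} \]

It is sufficient to show that for certain $x_1$, $x_2$, $p_1$, $p_2$, $p_3$ satisfying the
side-conditions (35) and (36) there results a value of $W$ smaller than that computed from
the above stationary solution. We keep $x_2$ constant and vary $x_1$ by an amount
$\Delta x_1$. The second-order term of the Taylor expansion in the neighbourhood of
stationary $W$ is

\[ I = (x_2 - x_1)A_1''(\Delta p_1)^2 + (x_3 - x_2)A_3''(\Delta p_3)^2 + \alpha p_1(\Delta x_1)^2 + 2(\alpha x_1 + \beta - A_1')\,\Delta p_1\,\Delta x_1. \tag{50} \]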




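Now from (35) we have

\[ \begin{aligned} p_1 &= (1 + x_2x_3)/[(x_2 - x_1)(x_3 - x_1)], \\ p_2 &= (1 + x_3x_1)/[(x_3 - x_2)(x_1 - x_2)], \\ p_3 &= (1 + x_1x_2)/[(x_1 - x_3)(x_2 - x_3)]. \end{aligned} \tag{51} \]

It follows that

\[ \frac{\partial p_1}{\partial x_1} = p_1\left(\frac{1}{x_2 - x_1} + \frac{1}{x_3 - x_1}\right) \tag{52} \]

and

\[ \frac{\partial p_3}{\partial x_1} = \frac{p_1}{x_3 - x_2}\left(\frac{x_2 - x_1}{x_3 - x_1}\right) = \frac{p_1Q}{x_3 - x_2}. \tag{53} \]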
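Relations (51) and (52) are easy to confirm numerically. The sketch below computes the three probabilities for an arbitrary illustrative triple of abscissae (the values $-1.5$, $0.2$, $2.0$ are hypothetical, chosen only so that all probabilities are positive) and compares (52) with a finite difference.

```python
def three_point_probs(x1, x2, x3):
    """Probabilities (51) of the three-point distribution with mean 0 and variance 1."""
    p1 = (1 + x2 * x3) / ((x2 - x1) * (x3 - x1))
    p2 = (1 + x3 * x1) / ((x3 - x2) * (x1 - x2))
    p3 = (1 + x1 * x2) / ((x1 - x3) * (x2 - x3))
    return p1, p2, p3

def dp1_dx1(x1, x2, x3):
    # Closed form (52) for the derivative of p1 with respect to x1.
    p1, _, _ = three_point_probs(x1, x2, x3)
    return p1 * (1 / (x2 - x1) + 1 / (x3 - x1))

# Illustrative abscissae only.
xs = (-1.5, 0.2, 2.0)
ps = three_point_probs(*xs)
total = sum(ps)
mean = sum(p * x for p, x in zip(ps, xs))
second_moment = sum(p * x * x for p, x in zip(ps, xs))
```

The probabilities sum to one, with zero mean and unit second moment, which is exactly the content of the side conditions (35).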


The last term of (50) can be rewritten from (37), (48) and (52) as

\[ \begin{aligned} 2\left(\frac{A_1}{p_1} - A_1'\right)\Delta p_1\,\Delta x_1 &= \left(\frac{A_1}{p_1} + \frac{A_1 - A_3}{p_2}\right)\Delta p_1\,\Delta x_1 \\ &= \left(\frac{A_1}{p_1} + \frac{A_1 - A_3}{p_2}\right)\left(\frac{1}{x_2 - x_1} + \frac{1}{x_3 - x_1}\right)p_1(\Delta x_1)^2 \end{aligned} \]

neglecting differentials of higher than the second order.
In this way it can be shown that $I < 0$ if

\[ |A_1''|\,(1 + Q)^2/Q^2 + |A_3''|\,Q/(1 - Q) > 2(A_1' + A_3')/p_1. \tag{54} \]

This inequality, a sufficient condition for ruling out the stationary solution as a
minimum, is easily proved for $n = 2$ or 3. For in that case we have $A = npq$ and
hence from (47)

\[ \begin{aligned} Q &= [A_1/p_1 + (A_1 - A_3)/p_2]/[2(A_1' + A_3')] \\ &= n(q_1 + p_1 - p_3)/[2n(1 - 2p_1 + 1 - 2p_3)] = (1 - p_3)/(4p_2). \end{aligned} \]

Also from (48)

\[ n(p_1 - p_3) = n(1 - p_1) - 2n(1 - 2p_1) \]

that is,

\[ 2p_1 + p_3 = 1 \]

so that $p_1 = p_2 = p$ (say), $p_3 = 1 - 2p$ and $Q = \tfrac12$.
Thus in (54)

\[ \text{L.H.S.} = 2n(9 + 1) > 4n = \text{R.H.S.} \]

For $n > 3$ it became necessary to establish by numerical evaluation that the
value of $W$ computed from the stationary solution is always larger than that
computed from the two-point distribution of (a). The procedure used was as
follows.





A representative range of values of $p_1$ was chosen, and the corresponding $p_3$
calculated from (48), which with $p_2 = 1 - p_1 - p_3$ is a relation between $p_1$
and $p_3$. Next, $Q$ was found from (47) and then $W$ and $X$, which could now be
determined by (35); for

\[ \begin{aligned} W/(x_3 - x_1) &= A_1Q + A_3(1 - Q), \\ \sigma/(x_3 - x_1) = 1/(x_3 - x_1) &= [p_2Q^2 + p_3 - (p_2Q + p_3)^2]^{1/2}, \end{aligned} \]

whence

\[ W = [A_1Q + A_3(1 - Q)]\,[p_2Q^2 + p_3 - (p_2Q + p_3)^2]^{-1/2}. \]

Also

\[ X = (1 - p_3 - p_2Q)\,[p_2Q^2 + p_3 - (p_2Q + p_3)^2]^{-1/2}. \]

In this way corresponding values of $X$ and $W$ were built up, and it was easily
seen that the value of $W$ obtained from this stationary solution lay well above
that calculated from the two-point solution (a) which is further discussed below.
Attention was focused on the range $n = 4$ to 20, but the tables so obtained
indicated that the two-point solution leads to the smaller $W$ for all $X$ and $n$.

It remains to eliminate case (c). This follows readily from the above approach.
Thus, setting up equations of the type (37) to (41) we reach analogously to (48)

\[ A_1'(1 - 2p_1) - A_3'(1 - 2p_3) + 2(A_1 - A_3) = 0 \tag{55} \]

or $F(p_1) - F(p_3) = 0$, where $F(p) = A'(1 - 2p) + 2A$. But $F'(p) = A''(1 - 2p) < 0$
for $0 < p < \tfrac12$ and clearly $0 < p_1, p_3 < \tfrac12$. Hence equation
(55) can be satisfied only if $p_1 = p_3 = p$, say. This makes $x_2 = 0$ and in this case
the second-order term of the Taylor expansion corresponding to (50) is easily
shown to be negative. Thus the only stationary solution possible in case (c) is in
fact a maximum.


Properties of the Solution. We may therefore evaluate the minimum under
condition (a), that is we confine ourselves to two point distributions with
probabilities $p$ and $q$ at $x_1 < 0$ and $x_2 > 0$ respectively. Without loss of generality we
assume $-x_1 \le x_2$. Instead of finding the minimum of $W/\sigma$ under the condition
$x_2 \le X$ we may determine the minimum for given $x_2$ and then consider it over
the range $1 \le x_2 \le X$. But for any fixed $x_2$ the conditions $p + q = 1$,
$px_1 + qx_2 = 0$, $px_1^2 + qx_2^2 = 1$ determine $p$, $q$ and $x_1$ uniquely, in fact

\[ p = x_2^2/(1 + x_2^2) \tag{56} \]

and, with this value of $p$, the mean range $W$ is given by

\[ W = (1 - p^n - q^n)/\sqrt{pq}. \tag{57} \]

Thus, introducing the functions $G_n(p) = (1 - p^n - q^n)/\sqrt{pq}$ and
$p(x_2) = x_2^2/(1 + x_2^2)$, the mean range $W$ is given by $W = G_n(p(x_2))$. In order to obtain
the lower bound for the mean range $E(w_n)$ we must determine the minimum of
$G_n(p(x_2))$ over the range $1 \le x_2 \le X$. Now it is shown below that the minimum
value of $G_n(p(x_2))$ must occur at one of the end points of the $x_2$-range, that is,
either for $x_2 = 1$ or for $x_2 = X$, so that we have the final result

\[ W \text{ lower bound} = \min \begin{cases} G_n(p(1)) = 2(1 - (\tfrac12)^{n-1}) \\ G_n(p(X)) = (1 - p^n - q^n)/\sqrt{pq} \end{cases} \tag{58} \]

where $p = X^2/(1 + X^2)$ and $q = 1 - p$.
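The lower bound (58) is a one-line computation once $G_n$ is available. A minimal sketch, with hypothetical function names:

```python
from math import sqrt

def G(n, p):
    """G_n(p) of (57): mean range of the two-point distribution with probabilities p and q."""
    q = 1.0 - p
    return (1.0 - p ** n - q ** n) / sqrt(p * q)

def p_of_x(x2):
    # (56): probability attached to x_1 when the upper point is x_2.
    return x2 * x2 / (1.0 + x2 * x2)

def lower_bound(n, X):
    """Formula (58): the smaller of G_n at x_2 = 1 and at x_2 = X (requires X >= 1)."""
    return min(G(n, 0.5), G(n, p_of_x(X)))
```

For instance, `lower_bound(6, 2)` is about 1.844 and `lower_bound(12, 5)` about 1.952, whereas for $n = 8$ and $X = 2$ the symmetrical end point governs and the bound is $2(1 - 2^{-7}) \approx 1.984$.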

This is shown in Table 2 for $n = 2\,(2)\,20$.

TABLE 2
Table of the lower bound of $E(w_n)$ given that $-X \le x \le X$

   X      n = 2      4        6         8         10        12

   1      1.000      …      1.938*    1.984     1.996     1.999
   2       .800      …      1.844*    1.984†    1.996     1.999
   3       .600      …      1.562     1.898†    1.996‡    1.999
   4       .485      …      1.296     1.633     1.932‡    1.999§
   5       .392      …      1.090     1.400     1.687     1.952§

 * 1.938 is to be used for $1 \le X \le 1.20$; for larger X interpolate in Table 2.
 † 1.984 is to be used for $1 \le X \le 2.64$; for larger X interpolate in Table 2.
 ‡ 1.996 is to be used for $1 \le X \le 3.76$; for larger X interpolate in Table 2.
 § 1.999 is to be used for $1 \le X \le 4.82$; for larger X interpolate in Table 2.

   p     X = {p/(1-p)}^{1/2}    n = 12     14        16        18        20

  .95         4.36              1.999*    2.000     2.000     2.000     2.000
  .96         4.90              1.976*    2.000     2.000     2.000     2.000
  .97         5.69              1.795     2.000†    2.000     2.000     2.000
  .98         7.00              1.538     1.760†    1.973‡    2.000§    2.000**
  .99         9.95              1.142     1.320     1.494     1.664§    1.831**

 * 1.999 is to be used for $1 \le X \le 4.82$; for larger X interpolate in Table 2.
 † 2.000 is to be used for $1 \le X \le 5.84$; for larger X interpolate in Table 2.
 ‡ 2.000 is to be used for $1 \le X \le 6.86$; for larger X interpolate in Table 2.
 § 2.000 is to be used for $1 \le X \le 7.89$; for larger X interpolate in Table 2.
 ** 2.000 is to be used for $1 \le X \le 8.90$; for larger X interpolate in Table 2.

In the case of variate range $a \le x \le b$ we may still use the lower bound (58)
with $X = \max(|a|, |b|)$. Since for the two point solution $x_1 = -1/x_2$, this
involves no loss in the sharpness of the lower bound obtained unless
$\min(|a|, |b|) < 1/X$. However, this situation of extreme skewness is clearly
rare and it does not seem worth-while to consider it further.

It remains to show that for the range $1 \le x_2 \le X$, the function $G_n(p(x_2))$
takes its minima at either $x_2 = 1$ or at $x_2 = X$. Since $p(x_2)$ is monotonic it suffices


to show that $G_n(p)$ can not have a local minimum in the range $\tfrac12 < p < 1$.
Suppose, then, that $G_n(p)$ had a local minimum at $p = p_1$, say, with $\tfrac12 < p_1 < 1$.
Then the necessary conditions $G_n'(p_1) = 0$ and $G_n''(p_1) \ge 0$ would have to be
satisfied where the dash denotes differentiation with regard to $p$. Introducing the
function $A(p) = 1 - p^n - q^n$ we have with $q_1 = 1 - p_1$:

\[ 0 = G_n'(p_1) = \{p_1q_1A' - \tfrac12(q_1 - p_1)A\}/(p_1q_1)^{3/2} \tag{59} \]

where $A$ and its derivative $A'(p) = -n(p^{n-1} - q^{n-1})$ are taken at argument
$p_1$. From (59) we obtain

\[ (q_1 - p_1)A = 2p_1q_1A' \tag{60} \]

and it is immediately clear that for $n = 2\,(1)\,5$ equation (60) can not be satisfied
for any $p_1$ between $\tfrac12$ and 1. For when substituting the expressions for $A$ and $A'$
we find

\[ 1 - p_1^n - q_1^n = 2np_1q_1(p_1^{n-1} - q_1^{n-1})/(p_1 - q_1) = 2np_1\sum_{i=1}^{n-1} p_1^{i-1}q_1^{n-i} \tag{61} \]

or

\[ 1 - p_1^n - q_1^n - \sum_{i=1}^{n-1}\binom{n}{i}p_1^iq_1^{n-i} = \sum_{i=1}^{n-1}\left(2n - \binom{n}{i}\right)p_1^iq_1^{n-i}. \tag{62} \]

The left-hand side of (62) is 0 while the right-hand side is positive since for $n \le 5$
we have $2n \ge \binom{n}{i}$ for all $i$ and for some $i$ we have $2n > \binom{n}{i}$.
Confining ourselves then to $n \ge 6$ we obtain from the conditions (60) and
$G_n''(p_1) \ge 0$ the inequality

\[ 2(p_1 - q_1)p_1q_1A'' - A' \ge 0 \tag{63} \]

or substituting $A' = -n(p^{n-1} - q^{n-1})$ and $A'' = -n(n-1)(p^{n-2} + q^{n-2})$
we obtain

\[ -n(n-1)\,2(p_1 - q_1)p_1q_1(p_1^{n-2} + q_1^{n-2}) + n(p_1^{n-1} - q_1^{n-1}) \ge 0. \tag{64} \]

Condition (64) can clearly not be satisfied if $2(n-1)(p_1 - q_1)q_1 > 1$, that is, if
$q_1(1 - 2q_1) > 1/[2(n-1)]$. But this inequality holds for the range $q' < q_1 < q''$
where

\[ q' = \tfrac14 - \sqrt{\tfrac{1}{16} - \tfrac{1}{4(n-1)}} \quad\text{and}\quad q'' = \tfrac14 + \sqrt{\tfrac{1}{16} - \tfrac{1}{4(n-1)}} \]

are tabled below:

   n      6      7      8     10     12     14     16     18     20
   q'   .138   .106   .086   .064   .051   .042   .036   .031   .028
   q''  .362   .394   .414   .436   .449   .458   .464   .469   .472
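The end points $q'$ and $q''$ are the two roots of $q(1 - 2q) = 1/[2(n-1)]$, so the small table above can be reproduced as follows (the function name is hypothetical):

```python
from math import sqrt

def excluded_interval(n):
    """Return (q', q''): between these values condition (64) cannot hold (n >= 6)."""
    root = sqrt(1.0 / 16.0 - 1.0 / (4.0 * (n - 1)))
    return 0.25 - root, 0.25 + root

# Values for the sample sizes listed in the text, rounded to three decimals.
interval_table = {n: tuple(round(v, 3) for v in excluded_interval(n))
                  for n in (6, 7, 8, 10, 12, 14, 16, 18, 20)}
```

The square root is real only for $n \ge 5$, and the interval widens toward $(0, \tfrac12)$ as $n$ grows, which is why only two narrow end ranges remain to be examined.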


We may therefore confine ourselves to the ranges $0 < q_1 \le q'$ and $q'' \le q_1 < \tfrac12$,
and will show that in these ranges equation (61) cannot be satisfied for $n \ge 7$.

Dividing the left- and right-hand sides of (61) by $nq$ and introducing the
functions

\[ L(q) = (1 - p^n - q^n)/nq, \qquad R(q) = 2p\sum_{i=1}^{n-1} p^{i-1}q^{n-i-1} \]

we have

\[ L(q) \le (1 - (1 - q)^n)/nq = 1 - \tfrac12(n-1)\,q\,(1 - \theta q)^{n-2} \]

where $\theta$ is a mean value from the quadratic remainder term in the Taylor
expansion of $(1 - q)^n$ at $q = 0$ and see that $L(q) < 1$.

On the other hand $R(q) \ge 2p(p^{n-2} + p^{n-3}q) = R^*(q)$, say. Since for $n \ge 7$
and $q \le q'$, $dR^*(q)/dq < 0$, $R^*(q)$ attains its minimum for $q = q'$ and
substituting the above values of $q'$ it will be found that $R^*(q') > 1$ for $n \ge 7$ so
that for $0 < q \le q'$ we have that $L(q) < 1 < R(q)$ and (61) cannot be satisfied.

Turning to the range $q'' \le q < \tfrac12$ we write $nq(L(q) - R(q))$ in the form (62),
that is,

\[ nq(L(q) - R(q)) = \sum_{i=1}^{n-1}\left(\binom{n}{i} - 2n\right)p^iq^{n-i}. \tag{65} \]

The only terms which are negative in the right-hand side of (65) are those for
$i = 1$ and $i = n - 1$. Taking the latter first and comparing it with those for
$i = n - 2$ and $i = n - 3$ we write for the sum of these three terms

\[ np^{n-1}q\left\{-1 + \left(\frac{n-1}{2} - 2\right)\frac{q}{p} + \left(\frac{(n-1)(n-2)}{6} - 2\right)\left(\frac{q}{p}\right)^2\right\}, \]

and it is clear that the quantity inside $\{\ \}$ is positive for $n \ge 7$ and $q'' \le q < \tfrac12$,
$p = 1 - q$. Likewise combining the terms for $i = 1$ and $i = 2$ we find

\[ npq^{n-1}\left\{-1 + \left(\frac{n-1}{2} - 2\right)\frac{p}{q}\right\} \]

and it is clear that the quantity inside the $\{\ \}$ is $\ge 0$ for any $p \ge q$ and $n \ge 7$.
It follows that equation (61) cannot be satisfied for $n \ge 7$ and the above ranges
of $q$.


For the remaining case $n = 6$ the only root $p_1$ of $G_6'(p_1) = 0$ over the range
$\tfrac12 < p_1 \le 1 - q''$; $1 - q' \le p_1 < 1$ was determined numerically as $p_1 = 0.5754$
and $G_6''(.5754) < 0$ verified by substitution in the left-hand side of (63).
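The numerical step for $n = 6$ can be reproduced with a simple bisection on the stationarity condition (60). This is a sketch rather than the authors' method: the bracket $[0.52, 0.62]$ is an assumption chosen by inspection so that it excludes the trivial root at $p = \tfrac12$, and the function names are hypothetical.

```python
def A(n, p):
    return 1.0 - p ** n - (1.0 - p) ** n

def dA(n, p):
    return -n * (p ** (n - 1) - (1.0 - p) ** (n - 1))

def d2A(n, p):
    return -n * (n - 1) * (p ** (n - 2) + (1.0 - p) ** (n - 2))

def stationarity(n, p):
    # (60) written as (q - p) A - 2 p q A' = 0.
    q = 1.0 - p
    return (q - p) * A(n, p) - 2.0 * p * q * dA(n, p)

def bisect(f, lo, hi, iters=60):
    # Plain bisection; assumes f changes sign on [lo, hi].
    f_lo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if (f(mid) < 0) == (f_lo < 0):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def lhs_63(n, p):
    # Left-hand side of (63); a negative value means G_n'' < 0 at a stationary point.
    q = 1.0 - p
    return 2.0 * (p - q) * p * q * d2A(n, p) - dA(n, p)

root_6 = bisect(lambda p: stationarity(6, p), 0.52, 0.62)
```

The root found this way agrees with the value 0.5754 quoted in the text, and `lhs_63(6, root_6)` is negative, so the stationary point is a local maximum of $G_6$ rather than a minimum.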


REFERENCES

[1] L. H. C. Tippett, "On the extreme individuals and the range of samples taken from a
normal population," Biometrika, Vol. 17 (1925), pp. 364-387.

[2] E. S. Pearson, "Some notes on the use of range," Biometrika, Vol. 37 (1950), pp. 88-92.

[3] R. L. Plackett, "Limits of the ratio of mean range to standard deviation," Biometrika,
Vol. 34 (1947), pp. 120-122.

[4] S. Moriguti, "Extremal properties of extreme value distributions," Ann. Math. Stat.,
Vol. 22 (1951), pp. 523-536.





SOME THEOREMS FOR PARTIALLY BALANCED DESIGNS 


By W. S. Connor and W. H. Clatworthy


National Bureau of Standards 


Summary. This paper generalizes certain results which are known for balanced 
incomplete block designs and group divisible designs to partially balanced in- 
complete block (PBIB) designs with m associate classes. Some of the results 
are for general m but others are for m = 2, 3, or 4. 


1. Introduction. Let $N$ be the incidence matrix of a PBIB design with $m$
associate classes. Then the determinant $|NN'|$ may be written as

\[ |NN'| = rk\,(r - z_1)^{\alpha_1} \cdots (r - z_t)^{\alpha_t}, \qquad \sum_{u=1}^{t} \alpha_u = v - 1, \quad t \le m, \]

where the $z$'s are distinct, $r - z_u$ $(u = 1, \cdots, t)$ are factors of $|NN'|$, and
$\alpha_u$ $(u = 1, \cdots, t)$ are their respective multiplicities. For any $m$ the factors,
and for $m = 2$, 3, and 4 the multiplicities, are expressed in terms of the
parameters of the design.

For $m$ general it is observed for $v > b$ that $|NN'|$ is zero, which implies that
one of the factors is zero, a slight modification of a condition of Nair [10]; and
for $v = b$ that $|NN'|$ is an integral square, a generalization of Shrikhande's
[11] and Chowla and Ryser's [7] result for balanced incomplete block designs,
and of Bose and Connor's result for group divisible designs [3].

The special case $m = 2$ is studied at length, with calculation of $|NN'|$ for
group divisible designs, which was first done in [3], triangular designs, and Latin
square designs with $i$ constraints. Corollaries to the general theorems
mentioned in the preceding paragraph are stated in detail, and several necessary
conditions for $v$ even and odd are developed from consideration of the integral
nature of the $\alpha$'s. These latter theorems are very useful in showing that certain
sets of parameters which satisfy the necessary conditions given by Bose and
Nair [4], and quoted in (2.2) below, do not correspond to constructible designs.

For general $m$, lower bounds are developed for $b$, a generalization of Fisher's
work for balanced incomplete block designs [9], and of the bounds for group
divisible designs [3]. Also, it is shown for any $m$ that the factors of $|NN'|$ are
nonnegative, which is obvious for balanced incomplete block designs and was
shown for group divisible designs in [3].


2. The definition of a PBIB design. A PBIB design with $m$ associate classes
has been defined by Bose and Shimamoto [5] substantially as follows:

A PBIB design with $m$ associate classes $[m \ge 1]$ is an arrangement of $v$
treatments (varieties) in $b$ blocks of $k$ experimental units (plots) each such that:

(i) Each of the $v$ treatments is replicated $r$ times, and no treatment appears
more than once in any block.

Received 3/14/53, revised 9/28/53. 





(ii) There exists a relationship of association between every pair of the $v$
treatments satisfying the following conditions:

(a) Any two treatments are either first, second, $\cdots$, or $m$th associates, and
any pair of treatments which are $s$th associates occur together in exactly $\lambda_s$
blocks $(s = 1, 2, \cdots, m)$.

(b) Each treatment has $n_s$ $s$th associates.

(c) For any pair of treatments which are $s$th associates the number of
treatments which are simultaneously $j$th associates of the first and $u$th associates
of the second is $p^s_{ju}$, and this number is independent of the pair of treatments
with which we start. Furthermore $p^s_{ju} = p^s_{uj}$ $(j \ne u;\ s, j, u = 1, 2, \cdots, m)$.

It is known that the following conditions are satisfied by the parameters
$v, b, r, k, \lambda_1, \lambda_2, \cdots, \lambda_m, n_1, n_2, \cdots, n_m, p^s_{ju}$ $(s, j, u = 1, 2, \cdots, m)$ of the
design:

\[ vr = bk, \qquad r(k - 1) = \sum_{s=1}^{m} n_s\lambda_s, \qquad v - 1 = \sum_{s=1}^{m} n_s, \]

\[ \sum_{u=1}^{m} p^i_{ju} = \begin{cases} n_j - 1 & \text{if } i = j \\ n_j & \text{if } i \ne j, \end{cases} \qquad n_ip^i_{ju} = n_jp^j_{iu} \quad (i, j, u = 1, 2, \cdots, m). \]

If $m = 2$, then clearly

\[ vr = bk, \qquad v - 1 = n_1 + n_2, \qquad r(k - 1) = n_1\lambda_1 + n_2\lambda_2, \]

\[ p^1_{11} + p^1_{12} + 1 = p^2_{11} + p^2_{12} = n_1, \qquad p^1_{21} + p^1_{22} = p^2_{21} + p^2_{22} + 1 = n_2, \]

\[ n_1p^1_{12} = n_2p^2_{11} \quad\text{and}\quad n_1p^1_{22} = n_2p^2_{12}. \]

When $m = 2$ we shall require that $\lambda_1 \ne \lambda_2$, for if $\lambda_1 = \lambda_2$ the design becomes a
balanced incomplete block design which we do not wish to consider.


3. The value of $|NN'|$ for the general case. Consider the incidence matrix
$N$ of the general PBIB design, that is,

\[ N = \begin{pmatrix} n_{11} & n_{12} & \cdots & n_{1b} \\ n_{21} & n_{22} & \cdots & n_{2b} \\ \vdots & \vdots & & \vdots \\ n_{v1} & n_{v2} & \cdots & n_{vb} \end{pmatrix} \tag{3.1} \]

where the rows represent treatments, the columns represent blocks, and $n_{ij} = 1$
or 0 according as the $i$th treatment $(i = 1, 2, \cdots, v)$ does or does not occur in
the $j$th block $(j = 1, 2, \cdots, b)$. Since every treatment is replicated $r$ times,

\[ \sum_{j=1}^{b} n_{ij}^2 = r \qquad (i = 1, 2, \cdots, v), \tag{3.2} \]

and since every treatment must occur in $\lambda_s$ blocks with each of its $s$th associates
$(s = 1, 2, \cdots, m)$, if treatments $i$ and $u$ are $s$th associates, then

\[ \sum_{j=1}^{b} n_{ij}n_{uj} = \lambda_s \qquad (i \ne u;\ i, u = 1, 2, \cdots, v). \tag{3.3} \]

Hence the elements of the symmetric matrix $NN'$ are $r$ in the principal diagonal
and $\lambda_s$'s elsewhere.

We now wish to evaluate $|NN'|$. Since for a particular design $r$ is fixed, and in our context we wish to determine $|NN'|$ for all $r$, it is convenient to consider the symmetric matrix $M$ which is obtained from $NN'$ by replacing $r$ with the variable $z$. The determinant $|M|$ may be regarded as a polynomial of the $v$th degree in $z$. We shall determine the zeros of this polynomial and thereby the factors of $|M|$. We observe that the $i$th row (and column) of $M$ contains the element $z$ in the position of the main diagonal and by Section 2 the other $v - 1$ positions of each row (and column) are occupied by $n_1$ $\lambda_1$'s, $n_2$ $\lambda_2$'s, $\cdots$, and $n_m$ $\lambda_m$'s. Hence if we add rows $2, 3, \cdots, v$ to the first row of $|M|$, then the elements of the first row are all

(3.4)  $z + \sum_{s=1}^{m} n_s \lambda_s$,

which we may factor out of the first row. Thus one zero of $|M|$ is

(3.5)  $z_0 = -\sum_{i=1}^{m} n_i \lambda_i$,

and therefore a factor of $|M|$ is $z - z_0$.

We next consider the problem of finding the zeros of $|M|$ from a different point of view. Let $X$ be the column vector $[x_1, x_2, \cdots, x_v]$. Then by a well known theorem from algebra, for

(3.6)  $|M| = 0$

it is necessary and sufficient that

(3.7)  $MX = 0$

have a solution other than $(0, 0, \cdots, 0)$. We shall seek the $v$ linearly independent nonnull solutions, to each of which there corresponds a zero of $|M|$.

Nair [10] and Bose [2] have shown when $v > b$ that $|A| = 0$, where $A$ is defined by (3.12) and (3.14) below, and we shall parallel Bose's argument. If $[x_1, x_2, \cdots, x_v]$ is a nontrivial solution, then by adding the $v$ equations in (3.7) we get

(3.8)  $(z - z_0) \sum_{i=1}^{v} x_i = 0$.

Hence for $z \ne z_0$,

(3.9)  $\sum_{i=1}^{v} x_i = 0$,

where $x_i$ corresponds to the treatment $i$. The excluded solution is $X_0 = [c, c, \cdots, c]$, where $c$ is arbitrary and $X_0$ corresponds to $z_0$.






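Let us denote by $S_s(x_i)$ the sum of the variables of the $s$th associates of the $i$th treatment. Then the $v$ equations of (3.7) may be written as

(3.10)  $z x_i + \sum_{s=1}^{m} \lambda_s S_s(x_i) = 0$,  $(i = 1, 2, \cdots, v)$.

Sum the equations (3.10) over the $s$th associates of the $i$th treatment and use the definition of a PBIB design and (3.9) to obtain

(3.11)  $\{\lambda_1 p^1_{s1} + \lambda_2 p^1_{s2} + \cdots + \lambda_m p^1_{sm} - \lambda_s n_s\} S_1(x_i) + \cdots$
        $+ \{z + \lambda_1 p^s_{s1} + \lambda_2 p^s_{s2} + \cdots + \lambda_m p^s_{sm} - \lambda_s n_s\} S_s(x_i) + \cdots$
        $+ \{\lambda_1 p^m_{s1} + \lambda_2 p^m_{s2} + \cdots + \lambda_m p^m_{sm} - \lambda_s n_s\} S_m(x_i) = 0$,

where $s = 1, 2, \cdots, m$.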


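Let us set

(3.12)  $a_{su} = \lambda_1 p^u_{s1} + \lambda_2 p^u_{s2} + \cdots + \lambda_m p^u_{sm} - \lambda_s n_s$,  $(s \ne u)$,

        $a_{ss} = z + \lambda_1 p^s_{s1} + \lambda_2 p^s_{s2} + \cdots + \lambda_m p^s_{sm} - \lambda_s n_s$.

Then the equations (3.11) are

(3.13)  $a_{11} S_1(x_i) + a_{12} S_2(x_i) + \cdots + a_{1m} S_m(x_i) = 0$
        $a_{21} S_1(x_i) + a_{22} S_2(x_i) + \cdots + a_{2m} S_m(x_i) = 0$
        $\cdots$
        $a_{m1} S_1(x_i) + a_{m2} S_2(x_i) + \cdots + a_{mm} S_m(x_i) = 0$.

Without loss of generality we can assume that $x_i \ne 0$. Hence $S_s(x_i)$, $(s = 1, 2, \cdots, m)$, are not all zero, since we have $x_i + \sum_{s=1}^{m} S_s(x_i) = 0$. This can happen if and only if

(3.14)  $|A| = |a_{su}| = 0$,  $(s, u = 1, 2, \cdots, m)$.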


From (3.10) it is clear that (3.14) is necessary and sufficient for a nonnull solution other than $X_0$ of (3.7). Hence we have the following lemma.

LEMMA 3.1. The distinct zeros of $|M|$ are $z_0$ and the distinct zeros of $|A|$, so that

(3.15)  $|M| = (z - z_0)(z - z_1)^{\alpha_1}(z - z_2)^{\alpha_2} \cdots (z - z_t)^{\alpha_t}$,

where $z_1, z_2, \cdots, z_t$ $(t \le m)$ are the distinct zeros of $|A|$ of (3.14), and

(3.16)  $\sum_{u=1}^{t} \alpha_u = v - 1$,  $(\alpha_u > 0)$.







When $z = r$, the definition of a PBIB design (Section 2) is satisfied. Further, $M = NN'$ and, by (2.1) and (3.5), $z - z_0 = rk$. Paraphrasing Lemma 3.1, we have the following theorem.

THEOREM 3.1. For a PBIB design with $m$ associate classes,

(3.17)  $|NN'| = rk(r - z_1)^{\alpha_1}(r - z_2)^{\alpha_2} \cdots (r - z_t)^{\alpha_t}$,

where $z_1, z_2, \cdots, z_t$ $(t \le m)$ are the distinct zeros of $|A|$ of (3.14) and $\sum_{u=1}^{t} \alpha_u = v - 1$, where $\alpha_u$ $(u = 1, 2, \cdots, t)$ is a positive integer.

If the PBIB design is symmetrical, that is, $v = b$ (or equivalently, $r = k$), then

(3.18)  $|NN'| = |N|^2$,

which must be an integral square since all of the elements of $N$ are integers. Noting that $rk = r^2$, we obtain the following corollary.

COROLLARY 3.1.1. For a symmetrical PBIB design with $m$ associate classes, it is necessary that

(3.19)  $|NN'| = r^2(r - z_1)^{\alpha_1}(r - z_2)^{\alpha_2} \cdots (r - z_t)^{\alpha_t}$

be an integral square.


If $v > b$ (i.e., $r < k$), then as has been pointed out by Nair [10] and Bose [2], $|NN'| = 0$, so that by (3.17),

(3.20)  $rk(r - z_1)^{\alpha_1}(r - z_2)^{\alpha_2} \cdots (r - z_t)^{\alpha_t} = 0$.

Since $r \ne 0$ and $k \ne 0$, it is necessary that $r$ be equal to one of $z_1, z_2, \cdots, z_t$. Hence the following corollary to Theorem 3.1.

COROLLARY 3.1.2. For a PBIB design in which $v > b$, it is necessary that

$|NN'| = rk(r - z_1)^{\alpha_1}(r - z_2)^{\alpha_2} \cdots (r - z_t)^{\alpha_t} = 0$,

so that $r$ is equal to one of $z_1, z_2, \cdots, z_t$.

In Sections 5 and 6, $\alpha_1, \alpha_2, \cdots, \alpha_t$ will be determined as functions of $z_0, z_1, \cdots, z_t$ $(t \le m)$ for $m = 2, 3$, and $4$. In the next section, for $m$ general, we shall develop lower bounds for $b$, and shall show that $r - z_u$ $(u = 1, \cdots, t)$ is nonnegative.


4. Lower bounds for $b$ and the nonnegativeness of the factors of $|NN'|$. Lower bounds for the number of blocks in group divisible designs were developed in [3]. It was also shown for group divisible designs that the factors $r - z_1$ and $r - z_2$ of $|NN'|$ cannot be negative. In this section we shall extend these results to PBIB designs with $m$ associate classes.

Since $M$ is symmetric, there exists an orthogonal matrix $C$ such that $C'MC$ is the diagonal matrix which has as elements the roots of the secular equation

(4.1)  $|M - \psi I| = 0$,

where $C'$ is the transpose of $C$ and $I$ is the identity matrix of order $v$ [8]. But the roots of $|M| = 0$ are $z_0, z_1, \cdots, z_t$ with multiplicities $1, \alpha_1, \cdots, \alpha_t$ respectively, so that the roots of (4.1) must be $\psi = z - z_0, z - z_1, \cdots, z - z_t$ with the same respective multiplicities.

Since $C$ is nonsingular, $M$ and $C'MC$ have the same rank, a fact which is useful in obtaining lower bounds for $b$. Thus, if $z \ne z_i$, $(i = 0, 1, 2, \cdots, t)$, then

(4.2)  Rank $M = v$,

but if $z = z_i$, $(i = 0, 1, 2, \cdots, \text{or } t)$, then

(4.3)  Rank $M = v - \alpha_i$,

where $\alpha_0 = 1$ is the multiplicity of $z_0$.

Now for $z = r$, the definition of a PBIB design (Section 2) is satisfied, $NN' = M$, and it is clear that

(4.4)  $b \ge$ Rank $N \ge$ Rank $NN' =$ Rank $M$.

If $r \ne z_u$, $(u = 1, 2, \cdots, t)$, then from (4.2) and (4.4) we obtain

(4.5)  $b \ge v$,

and if $r = z_u$, $(u = 1, 2, \cdots, \text{or } t)$, then from (4.3) and (4.4) we obtain

(4.6)  $b \ge v - \alpha_u$.

We summarize in the following theorem.

THEOREM 4.1. For a PBIB design with $m$ associate classes, if $r \ne z_u$ $(u = 1, 2, \cdots, t)$, then $b \ge v$, but if $r = z_u$ $(u = 1, 2, \cdots, \text{or } t)$, then $b \ge v - \alpha_u$.
If the design is resolvable (i.e., consists of $r$ sets of $b/r$ blocks each, $b/r$ an integer, where a set of blocks contains every treatment once each), then these inequalities may be improved. In this case the columns of $N$ may be arranged in $r$ sets of $b/r$ columns each, where a set of columns is such that 1 occurs once and only once in each row of the set. By adding the second, third, $\cdots$, and $(b/r)$th columns to the first column of a set we obtain a column with all 1's. Since there are $r$ sets, it is clear that

(4.7)  $b - (r - 1) \ge$ Rank $N$.

Using (4.7) with (4.2), we obtain

(4.8)  $b \ge v + r - 1$,

when $r \ne z_u$, $(u = 1, 2, \cdots, t)$, and using (4.7) with (4.3) we obtain

(4.9)  $b \ge v - \alpha_u + r - 1$,

when $r = z_u$. We summarize these results in the following theorem.

THEOREM 4.2. For a resolvable PBIB design with $m$ associate classes, if $r \ne z_u$, $(u = 1, 2, \cdots, t)$, then $b \ge v + r - 1$, but if $r = z_u$, $(u = 1, 2, \cdots, \text{or } t)$, then $b \ge v - \alpha_u + r - 1$.
We next show that $r - z_u$, $(u = 1, 2, \cdots, t)$, is nonnegative. Let $z = r$, so that $NN' = M$, and suppose that $N$ exists satisfying the definition of a PBIB design. Since $M = NN'$ is nonnegative (that is, positive semidefinite), and since the transformation matrix $C$ does not alter this property, the roots of (4.1) are nonnegative. We have proved the following theorem.

THEOREM 4.3. For a PBIB design with $m$ associate classes, $r \ge z_u$, $(u = 1, 2, \cdots, t)$.

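5. Partially balanced designs with two associate classes. In this section we shall treat partially balanced designs with two associate classes $(m = 2)$. From (3.12) and (3.14) it is seen that

(5.1)  $|A| = \begin{vmatrix} z + \lambda_1 p^1_{11} + \lambda_2 p^1_{12} - \lambda_1 n_1 & \lambda_1 p^2_{11} + \lambda_2 p^2_{12} - \lambda_1 n_1 \\ \lambda_1 p^1_{21} + \lambda_2 p^1_{22} - \lambda_2 n_2 & z + \lambda_1 p^2_{21} + \lambda_2 p^2_{22} - \lambda_2 n_2 \end{vmatrix} = 0$.

By use of (2.2) we may express $|A|$ in terms of $z$, $\lambda_1$, $\lambda_2$, $p^1_{12}$ and $p^2_{12}$. After adding the second row of determinant $|A|$ to the first row, expanding, and collecting terms according to powers of $z$, we obtain

(5.2)  $|A| = z^2 + [(\lambda_1 - \lambda_2)(p^2_{12} - p^1_{12}) - (\lambda_1 + \lambda_2)] z + [(\lambda_1 - \lambda_2)(\lambda_2 p^1_{12} - \lambda_1 p^2_{12}) + \lambda_1\lambda_2] = 0$.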

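If we let

(5.3)  $\gamma = p^2_{12} - p^1_{12}$,  $\beta = p^1_{12} + p^2_{12}$,

and

       $\Delta = \gamma^2 + 2\beta + 1$,

then the roots of (5.2) are

(5.4)  $z_u = \tfrac{1}{2}[(\lambda_1 - \lambda_2)(-\gamma + (-1)^u \sqrt{\Delta}) + (\lambda_1 + \lambda_2)]$,  $(u = 1, 2)$.

We observe that
(a) $\Delta > 0$ so that $z_1 \ne z_2$, and
(b) $z_1 < z_2$ if $\lambda_1 > \lambda_2 \ge 0$, but $z_1 > z_2$ if $0 \le \lambda_1 < \lambda_2$.
By (a), $t = m = 2$ in (3.17).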
Let us next determine the exponents, $\alpha_1$ and $\alpha_2$, of (3.17) in terms of the roots, $z_1$ and $z_2$, of $|A| = 0$. When $m = 2$,

(5.5)  $|M| = (z - z_0)(z - z_1)^{\alpha_1}(z - z_2)^{\alpha_2}$,

where

(5.6)  $\alpha_1 + \alpha_2 = v - 1$.

Expanding the factors of $|M|$ and collecting the coefficients of the powers of $z$ we obtain from (5.5),

(5.7)  $|M| = z^v - (z_0 + \alpha_1 z_1 + \alpha_2 z_2) z^{v-1} + \cdots$.

Again, expanding $|M|$ by its diagonal elements [1], we see that the coefficient of $z^{v-1}$ is zero. Hence from (5.7)

(5.8)  $z_1\alpha_1 + z_2\alpha_2 = -z_0$.







Solving (5.6) and (5.8) simultaneously and using (5.4), we obtain

(5.9)  $\alpha_1 = [v z_2 + (z_0 - z_2)]/(z_2 - z_1) = [(v - 1)(-\gamma + \sqrt{\Delta} + 1) - 2n_1]/2\sqrt{\Delta}$

and

(5.10)  $\alpha_2 = [v z_1 + (z_0 - z_1)]/(z_1 - z_2) = [(v - 1)(\gamma + \sqrt{\Delta} + 1) - 2n_2]/2\sqrt{\Delta}$.

When $z = r$, the definition of a PBIB design is satisfied and $M = NN'$. We thus have the following theorem which is a special case of Theorem 3.1.

THEOREM 5.1. For a PBIB design with two associate classes, it is necessary that

(5.11)  $|NN'| = rk(r - z_1)^{\alpha_1}(r - z_2)^{\alpha_2}$,  $\alpha_1 + \alpha_2 = v - 1$,

where $z_1$ and $z_2$ are given by (5.4), and $\alpha_1$ and $\alpha_2$ are given by (5.9) and (5.10). Furthermore, $z_1$ and $z_2$ are distinct and $\alpha_1$ and $\alpha_2$ are positive integers.

The positive integral condition on $\alpha_1$ and $\alpha_2$ $(\alpha_1 + \alpha_2 = v - 1)$ is useful in showing that some sets of parameters which satisfy the necessary conditions (2.2) have no solutions. From (5.9) and (5.10) it is seen that $\alpha_1$ and $\alpha_2$ depend only upon the parameters $n_1$, $n_2$, $p^1_{12}$, and $p^2_{12}$ of the design. Useful computational formulas for $\alpha_1$ and $\alpha_2$ are provided by (5.9) and (5.10).

We now have a special case of Corollary 3.1.1.

COROLLARY 5.1.1. For a symmetrical PBIB design with two associate classes, it is necessary that

(5.12)  $|NN'| = r^2(r - z_1)^{\alpha_1}(r - z_2)^{\alpha_2}$,  $\alpha_1 + \alpha_2 = v - 1$,

be an integral square.

When $v > b$, $r = z_u$ $(u = 1 \text{ or } 2)$, so that by (5.4), $\Delta$ is an integral square. Hence we have a special case of Corollary 3.1.2.

COROLLARY 5.1.2. For a PBIB design with two associate classes and $v > b$, it is necessary that

(a) $|NN'| = rk(r - z_1)^{\alpha_1}(r - z_2)^{\alpha_2} = 0$, so that either $r = z_1$ or $r = z_2$, and

(b) $\Delta$ be an integral square.


We shall next prove corollaries for three special types of partially balanced designs with two associate classes. The first and perhaps most important of these types is known as the group divisible design which has been rather fully developed by Bose and Shimamoto [5], Bose and Connor [3], and Bose, Shrikhande, and Bhattacharya [6]. For these designs

(5.13)  $v = mn$,  $n_1 = n - 1$,  $n_2 = n(m - 1)$,  $(n - 1)\lambda_1 + n(m - 1)\lambda_2 = r(k - 1)$,  $p^1_{12} = 0$,  $p^2_{12} = n - 1$,







where $m$ and $n$ are positive integers not less than 2. By (5.4), (5.9), and (5.10)

(5.14)  $z_1 = -n(\lambda_1 - \lambda_2) + \lambda_1$,  $z_2 = \lambda_1$,

(5.15)  $\alpha_1 = m - 1$, and $\alpha_2 = m(n - 1)$,

so that we have the following corollary.

COROLLARY 5.1.3. For a group divisible design it is necessary that

(5.16)  $|NN'| = rk[r - \lambda_1 + n(\lambda_1 - \lambda_2)]^{m-1}[r - \lambda_1]^{m(n-1)}$.

This result was obtained by Bose and Connor [3].
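The determinant formula (5.16) can be checked numerically on a small case. The following is a minimal sketch, not taken from the paper: it assumes an illustrative group divisible design with $m = 3$ groups of $n = 2$ treatments whose blocks are all pairs of treatments from different groups (so $\lambda_1 = 0$, $\lambda_2 = 1$), builds the incidence matrix $N$, and compares an exact determinant of $NN'$ with the right-hand side of (5.16). The helper names `det` and `group_of` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    size = len(a)
    result = Fraction(1)
    for col in range(size):
        pivot = next((i for i in range(col, size) if a[i][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for i in range(col + 1, size):
            factor = a[i][col] / a[col][col]
            for j in range(col, size):
                a[i][j] -= factor * a[col][j]
    return result


# Assumed example design: m groups of n treatments, blocks = cross-group pairs.
m, n = 3, 2
v = m * n


def group_of(treatment):
    return treatment // n


blocks = [p for p in combinations(range(v), 2) if group_of(p[0]) != group_of(p[1])]
b, k = len(blocks), 2
r = b * k // v           # from vr = bk
lam1, lam2 = 0, 1        # same-group pairs never meet; cross-group pairs meet once

# Incidence matrix N (treatments x blocks) and the product NN'.
N = [[1 if t in blk else 0 for blk in blocks] for t in range(v)]
NNt = [[sum(N[i][j] * N[u][j] for j in range(b)) for u in range(v)] for i in range(v)]

lhs = det(NNt)
rhs = r * k * (r - lam1 + n * (lam1 - lam2)) ** (m - 1) * (r - lam1) ** (m * (n - 1))
print(lhs, rhs)
```

For this assumed design $b = 12$ and $r = 4$, and both sides evaluate to the same integer. Exact rational arithmetic is used so that the comparison does not depend on floating-point tolerance.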
The second type of partially balanced design developed in [5] is known as the triangular design. For the triangular design

(5.17)  $v = n(n - 1)/2$,  $n_1 = 2(n - 2)$,  $n_2 = (n - 2)(n - 3)/2$,
        $p^1_{12} = n - 3$,  $p^2_{12} = 2(n - 4)$,

where $n$ is integral and greater than or equal to 4. From (5.4), (5.9), and (5.10) it is seen that

(5.18)  $z_1 = (4 - n)\lambda_1 + (n - 3)\lambda_2$,  $z_2 = 2\lambda_1 - \lambda_2$,

(5.19)  $\alpha_1 = n - 1$, and $\alpha_2 = n(n - 3)/2$,

so that we have the following corollary.

COROLLARY 5.1.4. For a triangular design it is necessary that

(5.20)  $|NN'| = rk[r + (n - 4)\lambda_1 - (n - 3)\lambda_2]^{n-1}[r - 2\lambda_1 + \lambda_2]^{n(n-3)/2}$.
A third type of partially balanced design with two associate classes defined in [5] is the Latin square type with $i$ constraints. For this type of design

(5.21)  $v = n^2$,  $n_1 = i(n - 1)$,  $n_2 = (n - 1)(n - i + 1)$,
        $p^1_{12} = (i - 1)(n - i + 1)$,  $p^2_{12} = i(n - i)$,

where $n$ and $i$ are integers and $2 \le i \le n$. Again, from (5.4), (5.9), and (5.10) we obtain

(5.22)  $z_1 = (i - n)(\lambda_1 - \lambda_2) + \lambda_2$,  $z_2 = i(\lambda_1 - \lambda_2) + \lambda_2$,

(5.23)  $\alpha_1 = i(n - 1)$, and $\alpha_2 = (n - 1)(n - i + 1)$,

from which we have the following corollary.

COROLLARY 5.1.5. For the Latin square type of design with $i$ constraints, it is necessary that

(5.24)  $|NN'| = rk[r - (i - n)(\lambda_1 - \lambda_2) - \lambda_2]^{i(n-1)}[r - i(\lambda_1 - \lambda_2) - \lambda_2]^{(n-1)(n-i+1)}$.

Next let us consider the special case of a PBIB design with two associate classes in which $\alpha_1 = \alpha_2$. Setting the right members of (5.9) and (5.10) equal to each other and recalling that $v - 1 = n_1 + n_2$, we obtain

(5.25)  $\gamma = (n_2 - n_1)/(n_1 + n_2)$.







Since $n_1$ and $n_2$ must both be positive integers it follows that

(5.26)  $-1 < -n_1/(n_1 + n_2) < \gamma = (n_2 - n_1)/(n_1 + n_2) < n_2/(n_1 + n_2) < 1$,

from which it follows that

(5.27)  $\gamma = 0$, or $p^1_{12} = p^2_{12}$,

since $\gamma$ must also be an integer. Hence, by (5.25) and (2.2),

(5.28)  $n_1 = n_2 = (v - 1)/2$.

Again, using (2.2) and (5.28) it is seen that $v$ is of the form $4t + 1$, where $t = p^1_{12} = p^2_{12} = p^2_{11}$. Since $\alpha_1 = \alpha_2$ and $\alpha_1 + \alpha_2 = v - 1$,

(5.29)  $\alpha_1 = \alpha_2 = (v - 1)/2 = 2t$.

From (5.3), (5.27), and (5.29)

(5.30)  $\Delta = v = 4t + 1$.


This completes the proof of the following theorem.

THEOREM 5.2. If in a PBIB design with two associate classes $\alpha_1 = \alpha_2$, then

(a) $p^1_{12} = p^2_{12} = t$,

(b) $n_1 = n_2 = \alpha_1 = \alpha_2 = (v - 1)/2 = 2t$,

and

(c) $v = \Delta = 4t + 1$,

where $t$ is a nonnegative integer defined by (a).

It is known that any integral square must be of the form $4p$ or $4p + 1$, $p$ a nonnegative integer. However, $\alpha_1 = \alpha_2$ does not imply that $\Delta$ is an integral square. In fact, designs having solutions exist with $\alpha_1 = \alpha_2$ and $\Delta$ not an integral square while others having $\alpha_1 = \alpha_2$ and $\Delta$ an integral square also have solutions.

Let us next consider a partially balanced design with two associate classes in which $v$ is odd and $\Delta$ is not an integral square. Let $\eta$ be defined by

(5.31)  $\eta = [(v - 1)(1 - \gamma) - 2n_1]/2\sqrt{\Delta}$.

Whether $v$ is odd or even (5.9) may be expressed in the form

(5.32)  $\alpha_1 = (v - 1)/2 + \eta$.

Since $v$ is odd, $(v - 1)/2$ is integral, and, since $\alpha_1$ must also be integral, $\eta$ must be integral. Since $\Delta$ is not an integral square, the only way $\eta$ can be integral is for $\eta$ to be equal to zero, that is,

(5.33)  $(v - 1)(1 - \gamma) - 2n_1 = 0$.

Using $v - 1 = n_1 + n_2$ in (5.33) it is seen that (5.25), (5.26), (5.27), (5.28), and (5.29) follow. Hence the following theorem.







THEOREM 5.3. If in a PBIB design with two associate classes $v$ is odd and $\Delta$ is not an integral square, then it is necessary that

(a) $p^1_{12} = p^2_{12} = p^2_{11} = t$,

(b) $n_1 = n_2 = \alpha_1 = \alpha_2 = (v - 1)/2 = 2t$,

and

(c) $v = \Delta = 4t + 1$,

where $t$ is a nonnegative integer defined by (a).

Now if $v$ is odd and $\Delta$ is an integral square (5.32) holds and $\eta$ must be an integer. Thus, we have the following theorem.

THEOREM 5.4. If in a PBIB design with two associate classes $v$ is odd and $\Delta$ is an integral square, then it is necessary that $\eta$ be an integer, where $\Delta$ is defined by (5.3) and $\eta$ is defined by (5.31).

Finally let us consider the case of a PBIB design with two associate classes in which $v$ is even. Then (5.9) can be written in the form

(5.34)  $\alpha_1 = \tfrac{1}{2}[v - 1 + 2\eta]$.

Since $\alpha_1$ must be integral, $v - 1 + 2\eta$ must be an even integer. But $v - 1$ is odd, and so $2\eta$ must be an odd integer. Since $v$, $\gamma$, and $n_1$ must all be integral, it is seen from (5.31) that $\Delta$ must be an integral square. This proves the following theorem.

THEOREM 5.5. If in a PBIB design with two associate classes $v$ is even, then it is necessary that

(a) $\Delta$ be an integral square, and
(b) $2\eta$ be an odd (positive or negative) integer,

where $\Delta$ and $\eta$ are defined by (5.3) and (5.31) respectively.
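The integrality conditions of Theorems 5.1 and 5.3 to 5.5 lend themselves to a mechanical screen of candidate parameter sets. The sketch below is an illustration under stated assumptions rather than the authors' procedure: it computes $\gamma$, $\Delta$, and $\eta$ from (5.3) and (5.31), forms $\alpha_1 = (v - 1)/2 + \eta$ as in (5.32), and reports whether both exponents are positive integers. The function names are hypothetical, and the last parameter set in the usage lines is made up purely to show a rejection.

```python
from fractions import Fraction
from math import isqrt


def alpha_exponents(n1, n2, p1_12, p2_12):
    """Return (alpha1, alpha2) via (5.31)-(5.32), or None if eta is irrational."""
    v = n1 + n2 + 1
    gamma = p2_12 - p1_12
    beta = p1_12 + p2_12
    delta = gamma * gamma + 2 * beta + 1
    numerator = (v - 1) * (1 - gamma) - 2 * n1   # numerator of eta in (5.31)
    if numerator == 0:
        eta = Fraction(0)                        # the case alpha1 = alpha2
    else:
        root = isqrt(delta)
        if root * root != delta:
            return None                          # sqrt(delta) irrational, so eta is too
        eta = Fraction(numerator, 2 * root)
    alpha1 = Fraction(v - 1, 2) + eta
    alpha2 = (v - 1) - alpha1
    return alpha1, alpha2


def passes_integrality(n1, n2, p1_12, p2_12):
    """True when both exponents are positive integers, as Theorem 5.1 requires."""
    result = alpha_exponents(n1, n2, p1_12, p2_12)
    return result is not None and all(a.denominator == 1 and a > 0 for a in result)


# Triangular design with n = 5, parameters from (5.17).
triangular = alpha_exponents(6, 3, 2, 2)
# Group divisible design with m = 3, n = 2, parameters from (5.13).
group_divisible = alpha_exponents(1, 4, 0, 1)
# Case of Theorem 5.2 with t = 1 (v = 5, delta = 5 is not a square).
equal_case = alpha_exponents(2, 2, 1, 1)
# Made-up parameter set with v odd, delta = 8 not a square, gamma != 0.
rejected = passes_integrality(3, 3, 1, 2)
print(triangular, group_divisible, equal_case, rejected)
```

The first two calls reproduce the exponents stated in (5.19) and (5.15) for those sizes. Working with the numerator of $\eta$ before dividing by $\sqrt{\Delta}$ avoids floating-point square roots entirely.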

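6. Partially balanced designs with three and four associate classes. In this section we shall obtain expressions for $\alpha_1, \alpha_2, \cdots, \alpha_t$ $(t = 3, 4)$, in terms of the roots $z_1, z_2, \cdots, z_t$ of $|A| = 0$ when the $z_i$, $(i = 1, 2, \cdots, t)$, are all different. First, we shall discuss the case of partially balanced designs with three associate classes $(m = 3)$.

When $m = 3$ we obtain from Lemma 3.1

(6.1)  $|M| = (z - z_0)(z - z_1)^{\alpha_1}(z - z_2)^{\alpha_2}(z - z_3)^{\alpha_3}$,

wherein the $z_i$, $(i = 1, 2, 3)$, are the distinct zeros of $|A|$ and

(6.2)  $\sum_{i=1}^{3} \alpha_i = v - 1$,  $(\alpha_i > 0)$.

Expanding each factor of $|M|$ of (6.1) and collecting coefficients of powers of $z$ gives

(6.3)  $|M| = z^v - \left[z_0 + \sum_{i=1}^{3} \alpha_i z_i\right] z^{v-1} + \left[z_0 \sum_{i=1}^{3} \alpha_i z_i + \tfrac{1}{2}\sum_{i=1}^{3} \alpha_i(\alpha_i - 1) z_i^2 + \sum_{i<j} \alpha_i \alpha_j z_i z_j\right] z^{v-2} - \cdots + (-1)^v z_0 z_1^{\alpha_1} z_2^{\alpha_2} z_3^{\alpha_3}$.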







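Again, expanding $|M|$ of (6.1) by its diagonal elements [1], it is seen that

(6.4)  $|M| = z^v - \tfrac{1}{2} v \left(\sum_{i=1}^{3} n_i \lambda_i^2\right) z^{v-2} + \cdots + |M_0|$,

where $M_0$ is obtained from $M$ by replacing $z$ by zero.

Equating the coefficients of $z^{v-1}$ in (6.3) and (6.4), we obtain

(6.5)  $\sum_{i=1}^{3} \alpha_i z_i = -z_0$,

while equating the coefficients of $z^{v-2}$ gives

(6.6)  $2 z_0 \sum_{i=1}^{3} \alpha_i z_i + \sum_{i=1}^{3} \alpha_i(\alpha_i - 1) z_i^2 + 2 \sum_{i<j} \alpha_i \alpha_j z_i z_j = -v \sum_{i=1}^{3} n_i \lambda_i^2$.

Now

(6.7)  $\left(\sum_{i=1}^{3} \alpha_i z_i\right)^2 = \sum_{i=1}^{3} \alpha_i^2 z_i^2 + 2 \sum_{i<j} \alpha_i \alpha_j z_i z_j$.

By use of (6.5), (6.6), and (6.7) we obtain

(6.8)  $\sum_{i=1}^{3} \alpha_i z_i^2 = -z_0^2 + v \sum_{i=1}^{3} n_i \lambda_i^2$.

Thus (6.2), (6.5), and (6.8) comprise a system of three nonhomogeneous linear equations in unknowns $\alpha_1$, $\alpha_2$, and $\alpha_3$,

(6.9)  $\alpha_1 + \alpha_2 + \alpha_3 = k_1$
       $z_1\alpha_1 + z_2\alpha_2 + z_3\alpha_3 = k_2$
       $z_1^2\alpha_1 + z_2^2\alpha_2 + z_3^2\alpha_3 = k_3$,

wherein the coefficient matrix is the Vandermonde matrix

(6.10)  $\begin{pmatrix} 1 & 1 & 1 \\ z_1 & z_2 & z_3 \\ z_1^2 & z_2^2 & z_3^2 \end{pmatrix}$

whose determinant is

(6.11)  $\Delta_3 = (z_2 - z_1)(z_3 - z_1)(z_3 - z_2)$,

and

(6.12)  $k_1 = v - 1$,  $k_2 = -z_0$,  $k_3 = -z_0^2 + v \sum_{i=1}^{3} n_i \lambda_i^2$.

If $z_1$, $z_2$, and $z_3$ are all distinct, then (6.9) has a unique solution. In fact, we obtain

(6.13)  $\alpha_1 = \dfrac{v\left[z_2 z_3 + \sum_{i=1}^{3} n_i \lambda_i^2\right] - (z_0 - z_2)(z_0 - z_3)}{(z_2 - z_1)(z_3 - z_1)}$,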







and from (6.9) it is clear that the corresponding expressions for $\alpha_2$ and $\alpha_3$ can be obtained from (6.13) by cyclically permuting the indices 1, 2, and 3.


For partially balanced designs with four associate classes the above procedure leads to


a IL e- 2) = of a+ (2 «)(D mat) - x] +1 @ - 


> 5 i=l i=l t= i=l 
(6.14) inj ij ixj ix) 


where 


4 
> pir & jks 
ke 


J. heme 1 


i | 


0 


he is 


and $z_1$, $z_2$, $z_3$, and $z_4$ are the distinct roots of $|A| = 0$.


REFERENCES
[1] A. C. Aitken, Determinants and Matrices, 5th ed., Oliver and Boyd, 1948.
[2] R. C. Bose, "A note on Nair's condition for partially balanced incomplete block designs with k > r," Calcutta Stat. Assn. Bull., Vol. 4 (1952), pp. 123-126.
[3] R. C. Bose and W. S. Connor, "Combinatorial properties of group divisible incomplete block designs," Ann. Math. Stat., Vol. 23 (1952), pp. 367-383.
[4] R. C. Bose and K. R. Nair, "Partially balanced incomplete block designs," Sankhyā, Vol. 4 (1939), pp. 337-372.
[5] R. C. Bose and T. Shimamoto, "Classification and analysis of partially balanced incomplete block designs with two associate classes," J. Amer. Stat. Assn., Vol. 47 (1952), pp. 151-184.
[6] R. C. Bose, S. S. Shrikhande and K. N. Bhattacharya, "On the construction of group divisible incomplete block designs," Ann. Math. Stat., Vol. 24 (1953), pp. 167-195.
[7] S. Chowla and H. J. Ryser, "Combinatorial problems," Canadian J. Math., Vol. 2 (1950), pp. 93-99.
[8] H. Cramér, Mathematical Methods of Statistics, Princeton University Press, 1946, p. 113.
[9] R. A. Fisher, "An examination of the different possible solutions of a problem in incomplete blocks," Ann. Eugenics, Vol. 10 (1940), pp. 52-75.
[10] K. R. Nair, "Certain inequality relationships among the combinatorial parameters of incomplete block designs," Sankhyā, Vol. 6 (1943), pp. 255-259.
[11] S. S. Shrikhande, "The impossibility of certain symmetrical balanced incomplete block designs," Ann. Math. Stat., Vol. 21 (1950), pp. 106-111.





ON THE REDUCED MOMENT PROBLEM 
By Salem H. Khamis


Economic Research Institute, American University of Beirut 


1. Summary. For a special class of cumulative distribution functions which 
are solutions of a given reduced moment problem (cf. paragraph 3, pages 27 
and 28, of [4]), the well known expression for the least upper bound of the 
absolute difference between any two solutions of the same reduced moment 
problem is improved upon by the introduction of a constant nonnegative 
multiplier which is smaller than unity in the case of the special class of solutions. 
Useful properties of the determinantal form of the classical expression for the 
least upper bound are derived. The numerical value of the constant multiplier 
is computed in the case of a well known class of cumulative distribution functions. 

In addition, a simple method is given for constructing, over a finite range, an 
infinite set of continuous and differentiable cumulative distribution functions 
which are solutions of the same reduced moment problem when one such solution 
is known. The new expression for the least upper bound, when applied to 
members of the constructed class of continuous solutions, may be helpful in 
deriving general, but crude, inequalities among orthogonal polynomials over a finite interval.


2. Introduction. Let $\Phi(x)$ be a cumulative distribution function (cdf) (by this we mean a nonnegative, nondecreasing function, which need however not be normalized) defined on the interval $a \le x \le b$, where either or both of $a$ and $b$ may be infinite. If either (or both) of $a$ and $b$ is (are) finite, we may speak of the range as being infinite provided we define

(1)  $\Phi(x) = 0$ for $x < a$,  $\Phi(x) = \mu_0$ for $x \ge b$.

We assume further that
(i) $\Phi(x)$ has at least $n + 1$ points of increase in the interval $[a, b]$.
(ii) The moments of $\Phi(x)$, defined by the Stieltjes integrals

$\mu_r = \int_a^b x^r \, d\Phi(x) = \int_{-\infty}^{\infty} x^r \, d\Phi(x)$,

exist for $r = 0, 1, 2, \cdots, 2n$.

Let $\Psi(x)$ be another cdf defined for $a' \le x \le b'$ and having all the properties of $\Phi(x)$ mentioned above (with obvious modifications). If the corresponding


Received 3/24/53. 





114 SALEM H. KHAMIS 


moments of $\Phi(x)$ and $\Psi(x)$, for $r = 0, 1, \cdots, 2n$, are equal respectively, then we have the inequality

(2)  $|\Phi(x) - \Psi(x)| \le \{Q'_{n+1}(x) Q_n(x) - Q_{n+1}(x) Q'_n(x)\}^{-1} = \rho_n(x)$,

say, where $Q_r(x)$, $(r = 0, 1, 2, \cdots, n)$, is a polynomial of exact degree $r$ given by the denominator of the $r$th convergent of the continued fraction associated with the integral $\int_{-\infty}^{\infty} d\Phi(t)/(x - t)$ or $\int_{-\infty}^{\infty} d\Psi(t)/(x - t)$ and possessing the following three properties:

(3)  $\int_{-\infty}^{\infty} Q_r(x) Q_s(x) \, d\Phi(x) = \int_{-\infty}^{\infty} Q_r(x) Q_s(x) \, d\Psi(x) = 0$ for $r \ne s$, and $= c_r$ for $r = s$,

where $c_r$ is a positive constant,

     $Q'_{r+1}(x) Q_r(x) - Q_{r+1}(x) Q'_r(x) > 0$,

and

(4)  $Q_r(x) = (\alpha_r x + \beta_r) Q_{r-1}(x) - Q_{r-2}(x)$,  $r = 1, 2, 3, \cdots, n$,

where $\alpha_r$ and $\beta_r$ are determinable coefficients independent of $x$ and where $Q_{-1}(x) = 0$, $Q_0(x) = 1$ and $\alpha_1 = 1/\mu_0$.

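Inequality (2) is easily deducible from the well known Tchebycheff inequalities [1]. A proof of (2), based on a method due to Stieltjes [8], is given by Uspensky in Appendix II of [10].

The function $\rho_n(x)$ appearing on the right-hand side of (2) has also been expressed in forms (a) and (b) below (cf. [4], pp. 42-44 and p. 72 for derivation and equivalence of forms (a) and (b)).

(a)

(5)  $\rho_n(x) = \left\{\sum_{r=0}^{n} \omega_r^2(x)\right\}^{-1}$,

where $\omega_r(x)$, $(r = 0, 1, 2, \cdots, n)$, is the orthonormal polynomial of exact degree $r$ associated with $d\Phi(x)$ or $d\Psi(x)$, that is, with the moment sequence $\{\mu_r\}$, $r = 0, 1, 2, \cdots, 2n$.

(b)

(6)  $\rho_n(x) = -\Delta_n / D_n(x)$,

where

(7)  $\Delta_n = \begin{vmatrix} \mu_0 & \mu_1 & \cdots & \mu_n \\ \mu_1 & \mu_2 & \cdots & \mu_{n+1} \\ \vdots & \vdots & & \vdots \\ \mu_n & \mu_{n+1} & \cdots & \mu_{2n} \end{vmatrix} = |\mu_{i+j}|_{i,j=0,1,2,\cdots,n}$,

     $D_n(x) = \begin{vmatrix} 0 & 1 & x & \cdots & x^n \\ 1 & \mu_0 & \mu_1 & \cdots & \mu_n \\ x & \mu_1 & \mu_2 & \cdots & \mu_{n+1} \\ \vdots & \vdots & \vdots & & \vdots \\ x^n & \mu_n & \mu_{n+1} & \cdots & \mu_{2n} \end{vmatrix}$.

The equivalence of (2) and (5) may be established by the properties (3) and (4) of $Q_r(x)$ and the well known properties of the orthonormal set $\omega_r(x)$ (cf. pp. 41-42 of [9], in particular, equations 3.2.1 and 3.2.4).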


3. Some properties of the determinants $\Delta_n$ and $D_n(x)$. Two properties of $\Delta_n$ and $D_n(x)$, believed to be new, are derived in this section. These properties, especially useful in the numerical evaluation of the two determinants, are given as two theorems.

THEOREM 1. The determinant $\Delta_n$ is an arithmetical invariant under a transformation of the origin of moments, that is, if

$\mu_k(a) = \int_{-\infty}^{\infty} (x - a)^k \, d\Phi(x)$,

then

(8)  $\Delta_n = |\mu_{i+j}|_{i,j=0,1,2,\cdots,n} = |\mu_{i+j}(a)|_{i,j=0,1,2,\cdots,n}$

for any arbitrary real number $a$.¹

PROOF. We apply to the determinant $\Delta_n$ in (7) the following two difference operations.

(a) For each element $\mu_{k+j}$ in the $(k + 1)$st row of $\Delta_n$ substitute

(9)  $\nu_{k,j}(a) = \int_{-\infty}^{\infty} x^j (x - a)^k \, d\Phi(x) = \sum_{r=0}^{k} (-1)^r \binom{k}{r} a^r \mu_{k-r+j}$.

Obviously such a substitution leaves $\Delta_n$ unchanged in value as it merely adds to the $(k + 1)$st row a linear sum of multiples of the preceding rows. If this difference operation is applied first to the $(n + 1)$st row, then to the $n$th row and so on, the determinant $\Delta_n$ will retain the same value. Thus we have

(10)  $\Delta_n = |\mu_{i+j}| = |\nu_{i,j}(a)|_{i,j=0,1,2,\cdots,n}$.


¹ J. Geronimus, in his paper, "On some persymmetric determinants formed by the polynomials of M. Appell," J. London Math. Soc., Vol. 6 (1931), pp. 55-59, obtained indirectly and as by-products of a solution to an extremal problem results equivalent to Theorems 1 and 2 given in the present paper. Geronimus assumes an absolutely continuous cumulative distribution function but his proofs apply equally well to the general case treated above. The author is indebted to Dr. H. P. Mulholland of the University of Birmingham who in a letter to the author, dated 25 October 1953, outlined the relevant results of Geronimus.







(b) For each element $\nu_{i,k}(a)$ in the $(k + 1)$st column of (10) substitute

(11)  $\xi_{i,k}(a) = \sum_{r=0}^{k} (-1)^r \binom{k}{r} a^r \nu_{i,k-r}(a)$

and apply this substitution first to the $(n + 1)$st column, then to the $n$th column, and so on. The resulting determinant is again equal to $\Delta_n$ as each column is replaced by a linear sum of multiples of the preceding columns.

The element $\xi_{i,j}(a)$ in the resulting determinant is, in virtue of (9), (10) and (11), equal to $\mu_{i+j}(a)$. Hence,

(12)  $\Delta_n = |\mu_{i+j}(a)|_{i,j=0,1,2,\cdots,n}$

as required.
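The invariance stated in Theorem 1 is easy to observe numerically. The following minimal sketch assumes a small discrete distribution (three mass points with rational weights, chosen only for illustration) and compares the determinant $|\mu_{i+j}(a)|$ for several origins $a$ using exact rational arithmetic. The helper names `det`, `shifted_moment`, and `hankel_det` are hypothetical.

```python
from fractions import Fraction


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    size = len(a)
    result = Fraction(1)
    for col in range(size):
        pivot = next((i for i in range(col, size) if a[i][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for i in range(col + 1, size):
            factor = a[i][col] / a[col][col]
            for j in range(col, size):
                a[i][j] -= factor * a[col][j]
    return result


def shifted_moment(points, weights, k, a=0):
    """k-th moment about a of a discrete distribution: sum of w * (x - a)**k."""
    return sum(w * (x - a) ** k for x, w in zip(points, weights))


def hankel_det(points, weights, n, a=0):
    """Determinant |mu_{i+j}(a)| for i, j = 0, 1, ..., n."""
    rows = [[shifted_moment(points, weights, i + j, a) for j in range(n + 1)]
            for i in range(n + 1)]
    return det(rows)


# Assumed example: three points of increase, so n = 2 is admissible.
points = [Fraction(0), Fraction(1), Fraction(3)]
weights = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
n = 2
about_origin = hankel_det(points, weights, n)
about_shifted = [hankel_det(points, weights, n, a)
                 for a in (Fraction(-2), Fraction(1, 2), Fraction(5))]
print(about_origin, about_shifted)
```

In practice the freedom to choose $a$ can be used to take moments about a convenient point, such as the mean, before evaluating $\Delta_n$.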
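THEOREM 2. The determinant $D_n(x)$ may also be expressed in the form $-|\mu_{i+j}(x)|_{i,j=1,2,3,\cdots,n}$. That is, the following relation holds identically for all $x$,

(13)  $\begin{vmatrix} 0 & 1 & x & \cdots & x^n \\ 1 & \mu_0 & \mu_1 & \cdots & \mu_n \\ x & \mu_1 & \mu_2 & \cdots & \mu_{n+1} \\ \vdots & \vdots & \vdots & & \vdots \\ x^n & \mu_n & \mu_{n+1} & \cdots & \mu_{2n} \end{vmatrix} = - \begin{vmatrix} \mu_2(x) & \mu_3(x) & \cdots & \mu_{n+1}(x) \\ \mu_3(x) & \mu_4(x) & \cdots & \mu_{n+2}(x) \\ \vdots & \vdots & & \vdots \\ \mu_{n+1}(x) & \mu_{n+2}(x) & \cdots & \mu_{2n}(x) \end{vmatrix}$,

where $\mu_i = \mu_i(0)$ for $i = 0, 1, 2, \cdots, 2n$.

PROOF. Applying two differencing operations to the left-hand side of (13) similar to the differencing operations used in the proof of Theorem 1, replacing $a$ by $x$ throughout (with obvious modifications due to the difference between the orders of the determinants $D_n(x)$ and $\Delta_n$), relation (13) may be easily established.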
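Relation (13) and the resulting expression of $\rho_n(x)$ as a ratio of two moment determinants can be checked on a small example. The sketch below is illustrative only: it assumes a discrete distribution with four equally weighted mass points, takes $n = 2$, and verifies for a few values of $x$ that the bordered determinant $D_n(x)$ equals $-|\mu_{i+j}(x)|_{i,j=1,\cdots,n}$. All function names are hypothetical.

```python
from fractions import Fraction


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    size = len(a)
    result = Fraction(1)
    for col in range(size):
        pivot = next((i for i in range(col, size) if a[i][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for i in range(col + 1, size):
            factor = a[i][col] / a[col][col]
            for j in range(col, size):
                a[i][j] -= factor * a[col][j]
    return result


def moment(points, weights, k, a=0):
    """k-th moment about a of a discrete distribution."""
    return sum(w * (p - a) ** k for p, w in zip(points, weights))


def bordered_det(points, weights, n, x):
    """D_n(x): the Hankel matrix of moments about 0 bordered by 0, 1, x, ..., x**n."""
    rows = [[Fraction(0)] + [x ** j for j in range(n + 1)]]
    for i in range(n + 1):
        rows.append([x ** i] + [moment(points, weights, i + j) for j in range(n + 1)])
    return det(rows)


def shifted_minor(points, weights, n, x):
    """|mu_{i+j}(x)| for i, j = 1, ..., n, with moments taken about x."""
    return det([[moment(points, weights, i + j, x) for j in range(1, n + 1)]
                for i in range(1, n + 1)])


def rho(points, weights, n, x):
    """rho_n(x) as the ratio of the full Hankel determinant to the shifted minor."""
    delta = det([[moment(points, weights, i + j) for j in range(n + 1)]
                 for i in range(n + 1)])
    return delta / shifted_minor(points, weights, n, x)


# Assumed example: four points of increase, total mass 1, n = 2.
points = [Fraction(0), Fraction(1), Fraction(3), Fraction(4)]
weights = [Fraction(1, 4)] * 4
n = 2
xs = [Fraction(0), Fraction(2), Fraction(-1, 3)]
checks = [bordered_det(points, weights, n, x) == -shifted_minor(points, weights, n, x)
          for x in xs]
bounds = [rho(points, weights, n, x) for x in xs]
print(checks, bounds)
```

The shifted form needs a determinant of order $n$ rather than $n + 2$, which is the computational saving the theorem offers.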

In view of Theorems 1 and 2 one obtains a new expression for $\rho_n(x)$ given by

(14)  $\rho_n(x) = |\mu_{i+j}(a)|_{i,j=0,1,2,\cdots,n} \,/\, |\mu_{i+j}(x)|_{i,j=1,2,3,\cdots,n}$,

where $a$ is any real number and where again

$\mu_k(x) = \int_{-\infty}^{\infty} (t - x)^k \, d\Phi(t)$.

The inequality

(15)  $|\Phi(x) - \Psi(x)| \le \rho_n(x)$,

where $\rho_n(x)$ is given by any of the expressions specified above, gives a gauge of the error involved in approximating to an unknown cdf by another known cdf whose moments, up to order $2n$, are equal to the corresponding moments of the unknown function. However, the gauge given by (15) is usually much larger than the actual error as is well known in practical problems of this type. We shall show below that under certain conditions inequality (15) can be improved upon.







4. Improvements upon inequality (15). Let $\Phi(x)$ and $\Psi(x)$ be defined as in Section 2. We need only consider the case when $\Phi(x)$ and $\Psi(x)$ are not identically equal, as the special case is trivial. Consider the two functions

(16)  $\Phi(x) - \Psi(x) + A\Psi(x)$

and

(17)  $A\Psi(x)$,

where $A$ is a positive constant chosen so that (16) is a cdf with at least $n + 1$ points of increase. As both $\Phi(x)$ and $\Psi(x)$ are never-decreasing functions, $A$ need not exceed unity. Thus we may write the condition

(18)  $0 < A \le 1$.

Since $\Phi(x)$ and $\Psi(x)$ have identical moments up to order $2n$, the functions (16) and (17) have also identical moments up to order $2n$. Therefore, the functions

(19)  $[\Phi(x) - (1 - A)\Psi(x)]/A = \Phi_1(x)$

and

(20)  $\Psi(x)$

have equal moments up to order $2n$, which in turn are identical with the corresponding moments of $\Phi(x)$. Further, the functions (19) and (20) are nonnegative, never-decreasing, and have at least $n + 1$ points of increase each. Therefore, by (15), we have

$|\Phi_1(x) - \Psi(x)| \le \rho_n(x)$,

which reduces to

(21)  $|\Phi(x) - \Psi(x)| \le A\rho_n(x)$,

where $\rho_n(x)$ is given by any of the expressions (2), (5), (6) and (14) of Sections 2 and 3. Similarly, by considering the functions

$\Psi(x) - \Phi(x) + B\Phi(x)$

and $B\Phi(x)$ we obtain the inequality

(22)  $|\Phi(x) - \Psi(x)| \le B\rho_n(x)$,

where $0 < B \le 1$.

When A and B are each equal to unity, inequalities (21) and (22) will be 
identical with (15). However, when it is possible to choose either or both of A 
and B less than unity, then we have an improvement upon inequality (15) (see 
Section 5 below). When both of A and B may be chosen less than unity, the 







inequality involving the smaller of the two constants naturally would lead to a 
better improvement. 

When the functions $\Phi(x)$ and $\Psi(x)$ are assumed to be continuous and differentiable for all $x$ in the ranges $[a, b]$ and $[a', b']$ respectively (at $a$, $a'$ and $b$, $b'$ continuity and differentiability only on the right and on the left respectively are assumed), in which case each of the functions $\Phi(x)$ and $\Psi(x)$ possesses an infinite number of points of increase, one may obtain the smallest positive number that could be assigned to the constant $A$ as follows.

Since $A$ must be chosen so that $\Phi_1(x)$ becomes a nonnegative, never-decreasing function with at least $n + 1$ points of increase, we must choose $A$ so that, writing $f(x)$ for the function (16),

$f'(x) = \Phi'(x) - \Psi'(x) + A\Psi'(x) \ge 0$

for all $x$, and such that $f(x)$ has at least $n + 1$ points of increase. The minimum value of $A$ which makes $f(x)$ a nonnegative and never-decreasing function is

(23)  $A_0 = 1 + \text{l.u.b.}(-\Phi'(x)/\Psi'(x))$,

where the least upper bound is taken over $\min(a, a') \le x \le \max(b, b')$, provided $-\Phi'(x)/\Psi'(x)$ is bounded above.

If we choose $A = A_0 + \epsilon$ where $\epsilon$ is an arbitrary positive number, then $f(x)$ will have more than $n + 1$ points of increase. Hence

$|\Phi(x) - \Psi(x)| \le (A_0 + \epsilon)\rho_n(x)$

and since $\epsilon$ is arbitrary, we have

(24)  $|\Phi(x) - \Psi(x)| \le A_0\rho_n(x)$,

where $A_0$ is given by (23), provided $-\Phi'(x)/\Psi'(x)$ is bounded above. That $f(x)$ has an infinite number of points of increase when $0 < A = A_0 \le 1$ is, in fact, obvious because $\Phi(x)$ and $\Psi(x)$ are distinct continuous and differentiable cumulative distribution functions possessing equal moments up to order $2n$ [10].

Similarly we may choose the minimum value of $B$ given by

(25)  $B_0 = 1 + \text{l.u.b.}_{c \le x \le d}(-\Psi'(x)/\Phi'(x))$,

where $c = \min(a, a')$ and $d = \max(b, b')$, provided $-\Psi'(x)/\Phi'(x)$ is bounded above.

Therefore, we may write instead of (15) the inequality

(26)  $|\Phi(x) - \Psi(x)| \le \min(A_0, B_0)\rho_n(x)$,

where $A_0$ and $B_0$ are given by (23) and (25).
The above results may be summarized by the following theorem.²

THEOREM 3. If $\Phi(x)$, $a \le x \le b$, and $\Psi(x)$, $a' \le x \le b'$, are two nonidentical continuous and differentiable cumulative distribution functions whose corresponding moments up to order $2n$ exist and are equal and if at least one of the ratios

2 Theorem 3 and its proof were first given, in a slightly modified form, in an appendix 
to a Ph.D. thesis submitted by the author to the University of London in May, 1950. 







$-\Phi'(x)/\Psi'(x)$ and $-\Psi'(x)/\Phi'(x)$ is bounded above for $c \le x \le d$, where
$c = \min(a, a')$ and $d = \max(b, b')$, then

$$|\Phi(x) - \Psi(x)| \le C\rho_n(x)$$

where

$$0 < C = 1 + \min\Bigl\{\operatorname{l.u.b.}_{c \le x \le d}\bigl(-\Phi'(x)/\Psi'(x)\bigr),\ \operatorname{l.u.b.}_{c \le x \le d}\bigl(-\Psi'(x)/\Phi'(x)\bigr)\Bigr\}$$

and where

$$\rho_n(x) = |\mu_{i+j}|_{i,j=0,1,2,\dots,n} \Big/ |\mu_{i+j}(x)|_{i,j=1,2,\dots,n}$$

and $\mu_r$ and $\mu_r(t)$ are the rth moments about the origin and t respectively.
A second theorem, which leads to an improvement of inequality (26) when
each of $A_0$ and $B_0$ exists and is less than unity, may be stated as follows.
THEOREM 4. If $\Phi(x)$ and $\Psi(x)$ satisfy the conditions of Theorem 3, and if both
of $A_0$ and $B_0$ of equations (23) and (25) exist, then
(27) $$|\Phi(x) - \Psi(x)| \le K\rho_n(x)$$


where


(28) $$0 < K = A_0B_0/(A_0 + B_0 - A_0B_0) \le \min(A_0, B_0) \le 1.$$


PROOF OF THEOREM 4. The two functions

$$\Phi_1(x) = \{\Phi(x) - \Psi(x) + A_0\Psi(x)\}/A_0$$

and

$$\Psi_1(x) = \{\Psi(x) - \Phi(x) + B_0\Phi(x)\}/B_0$$

satisfy all the conditions for the application of inequality (15) and possess the
same moments up to order 2n as the original functions $\Phi(x)$ and $\Psi(x)$. Hence

$$|\Phi_1(x) - \Psi_1(x)| = (A_0 + B_0 - A_0B_0)\,|\Phi(x) - \Psi(x)|/(A_0B_0) \le \rho_n(x).$$

Since $0 < A_0 \le 1$ and $0 < B_0 \le 1$, then $(A_0 + B_0 - A_0B_0)/A_0B_0 > 0$ and
therefore

$$|\Phi(x) - \Psi(x)| \le \{A_0B_0/(A_0 + B_0 - A_0B_0)\}\rho_n(x) = K\rho_n(x)$$

as required.

If either $A_0$ or $B_0$ is equal to unity, inequality (28) reduces to inequality (26).
However, if each of $A_0$ and $B_0$ is less than unity, inequality (28) is an improvement
upon inequality (26) since, in this case,

$$0 < K = A_0B_0/(A_0 + B_0 - A_0B_0) < \min(A_0, B_0).$$
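
The size of the improvement is easy to tabulate. The following is a minimal sketch, assuming only the formula for K in (28); the function name `improved_constant` is hypothetical and introduced here for illustration.

```python
def improved_constant(a0: float, b0: float) -> float:
    """Return K = A0*B0 / (A0 + B0 - A0*B0), the constant of (28).

    Both arguments are assumed to lie in (0, 1], as in Theorem 4.
    """
    if not (0 < a0 <= 1 and 0 < b0 <= 1):
        raise ValueError("A0 and B0 must lie in (0, 1]")
    return a0 * b0 / (a0 + b0 - a0 * b0)
```

When one of the constants equals unity the function returns the other one, which is the constant of (26); when both are below unity the result is strictly smaller than either.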


Examples are given in the following two sections which show the existence
of classes of cdf's for which inequality (27) is an improvement upon inequality
(15). At most (even when the least upper bound of the ratio $-\Phi'(x)/\Psi'(x)$ and





120 SALEM H. KHAMIS 


that of its reciprocal do not exist) the maximum value that both $A_0$ and $B_0$ can
take is unity.


5. An application of inequality (27). As an illustration of a case when all the
constants $A_0$, $B_0$ and K exist and each is less than unity let us apply inequality
(27) to the class of cumulative distribution functions given on page 106 of [2].
In particular consider any two cdf's of this class defined by

$$\Phi(x) = \int_0^x k e^{-\alpha t^{\lambda}}\bigl(1 + \epsilon \sin(\beta t^{\lambda})\bigr)\,dt$$

$$\Psi(x) = \int_0^x k e^{-\alpha t^{\lambda}}\bigl(1 + \epsilon_1 \sin(\beta t^{\lambda})\bigr)\,dt,$$

where $0 \le t \le \infty$; $k > 0$; $\alpha > 0$; $0 < \lambda < \tfrac{1}{2}$; $\beta = \alpha \tan \lambda\pi$; and
$-1 < \epsilon < \epsilon_1 < 1$.
The two cdf's $\Phi(x)$ and $\Psi(x)$ are distinct and have equal moments of all orders
irrespective of the distinct values of $\epsilon$ and $\epsilon_1$ in the open interval (-1, 1).
Applying inequality (27) one obtains the inequality

$$|\Phi(x) - \Psi(x)| \le K \Big/ \sum_{r=0}^{n} \omega_r^2(x)$$

with $A_0 = (\epsilon_1 - \epsilon)/(1 + \epsilon_1)$, $B_0 = (\epsilon_1 - \epsilon)/(1 - \epsilon)$, and $K = \tfrac{1}{2}(\epsilon_1 - \epsilon)$,
and where $\omega_r(x)$ is the orthonormal polynomial of degree r associated with the
given distribution functions. The series $\sum_{r=0}^{\infty} \omega_r^2(x)$ converges as $n \to \infty$
because the two distributions are distinct. Obviously inequality (27) applies in
this case and gives improved limits for $|\Phi(x) - \Psi(x)|$, in comparison with
inequality (15). In this case both $A_0$ and $B_0$, and therefore K, exist. Other
applications are given in the following section.
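
The constants of this example can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the original argument: it assumes that the two densities differ only through a factor of the form 1 + (constant) times a sine term, and that the sine term takes every value in [-1, 1], so the least upper bounds in (23) and (25) are approximated by a maximum over a grid of those values. The name `section5_constants` and the grid size are hypothetical.

```python
def section5_constants(eps: float, eps1: float, grid: int = 2001):
    """Return (A0, B0, K) for -1 < eps < eps1 < 1.

    The ratio of the two densities depends on t only through the value s
    of the sine term, so each least upper bound is taken over s in [-1, 1].
    """
    s_values = [-1 + 2 * i / (grid - 1) for i in range(grid)]
    # (23): A0 = 1 + l.u.b.(-Phi'/Psi')
    a0 = 1 + max(-(1 + eps * s) / (1 + eps1 * s) for s in s_values)
    # (25): B0 = 1 + l.u.b.(-Psi'/Phi')
    b0 = 1 + max(-(1 + eps1 * s) / (1 + eps * s) for s in s_values)
    # (28): K = A0*B0 / (A0 + B0 - A0*B0)
    k = a0 * b0 / (a0 + b0 - a0 * b0)
    return a0, b0, k
```

For a pair such as eps = -0.2 and eps1 = 0.6 the three values can be compared with the closed forms quoted in the text, in particular with K equal to half the difference of the two parameters.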


6. Class of cumulative distribution functions possessing moments equal up
to a specified order to those of a given cdf. We consider the case of a cdf which
is continuous and differentiable and has a finite range. The extension to a cdf
with a finite number of points of increase is simple. However, I have not yet
succeeded in extending the following results to the case of a cdf with an infinite
range.

Let $F(x) = \int_a^x f(t)\,dt$ be any continuous and differentiable cdf with moments
$\mu_r$, r = 0, 1, ···, and $a \le x \le b$, where both a and b are finite. Let
$p_r(x)$, r = 0, 1, ···, be the set of orthogonal polynomials over the range [a, b]
associated with f(x) as a weight function. Then we have the following theorem.


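THEOREM 5. For all $\epsilon$ such that $|\epsilon| \le 1$, the class of cumulative distribution
functions

(29) $$F_n(x, \epsilon) = \int_a^x f(t)\bigl(1 + \epsilon\, p_{n+i}(t)/L_{n+i}\bigr)\,dt, \qquad a \le x \le b,$$

where

$$L_r = \operatorname{l.u.b.}_{a \le x \le b} |p_r(x)| \quad\text{and}\quad i = 1, 2, \cdots,$$

possess the same moments up to order n, provided the required moments exist.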







The proof is obvious since $|\epsilon\, p_{n+i}(x)|/L_{n+i} \le 1$ and because of the orthogonality
property of the polynomials $p_r(x)$. Over an infinite range the polynomials $p_r(x)$
are not bounded and therefore the theorem does not hold.
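
A small exact check of Theorem 5 can be made in one concrete case. The sketch below is an illustration only and rests on assumptions not taken from the text: the weight is the uniform density f(t) = 1/2 on [-1, 1], the orthogonal polynomials are then the Legendre polynomials (for which the least upper bound of the absolute value on [-1, 1] is 1), and i = 1. The function names `legendre` and `moment` are hypothetical.

```python
from fractions import Fraction


def legendre(r: int) -> list:
    """Coefficients (lowest power first) of the Legendre polynomial P_r."""
    p_prev, p = [Fraction(1)], [Fraction(0), Fraction(1)]
    if r == 0:
        return p_prev
    for k in range(1, r):
        # Bonnet recurrence: (k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1}
        shifted = [Fraction(0)] + p
        padded = p_prev + [Fraction(0)] * (len(shifted) - len(p_prev))
        nxt = [((2 * k + 1) * s - k * q) / (k + 1)
               for s, q in zip(shifted, padded)]
        p_prev, p = p, nxt
    return p


def moment(k: int, n: int, eps) -> Fraction:
    """k-th moment of the density (1/2)(1 + eps * P_{n+1}(t)) on [-1, 1]."""
    density = [c * eps for c in legendre(n + 1)]
    density[0] += 1
    total = Fraction(0)
    for j, c in enumerate(density):
        power = k + j
        if power % 2 == 0:
            # integral of t**power over [-1, 1] is 2/(power+1) for even power
            total += c * Fraction(2, power + 1)
    return total / 2
```

With n = 3, the moments of orders 0 to 3 agree exactly for eps = 0 and eps = 1/2, while the moment of order 4 differs, which is the behaviour the theorem describes.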

It is to be noted above that $F_n(x, 0) = F(x)$ and therefore the set $F_n(x, \epsilon)$
possesses the same moments as F(x) up to order n. Thus given any cdf over a
finite interval which possesses moments up to order 2n + i with $i \ge 1$, one
may construct by Theorem 5 an infinite number of cumulative distribution
functions which possess the same moments up to order n as the given cdf. The
existence of the moments up to order 2n + 1 implies the existence of the
associated orthogonal polynomial $p_{n+1}(x)$.

The class of cumulative distribution functions $F_{2n}(x, \epsilon)$ provides a suitable
set for illustrating the improvement obtained by introducing the constant K
in inequality (27). As an illustration consider the two functions $F_{2n}(x, 0)$ and
$F_{2n}(x, \epsilon_1)$ where $\epsilon_1 \ne 0$. Applying inequality (27) to these two functions one
gets $A_0 = |\epsilon_1|/(1 + |\epsilon_1|)$, $B_0 = |\epsilon_1|$, and $K = |\epsilon_1|/2$,
and therefore

(30) $$|F_{2n}(x, \epsilon_1) - F_{2n}(x, 0)| \le |\epsilon_1| \Big/ \Bigl(2\sum_{r=0}^{n} \omega_r^2(x)\Bigr)$$

where

$$\omega_r^2(x) = p_r^2(x) \Big/ \int_a^b p_r^2(x) f(x)\,dx.$$

This represents a reduction in the bound for the absolute difference which is
at least equal to half the bound given by inequality (15).

Incidentally inequality (30), upon substitution from (29), reduces to a general
inequality among the orthogonal polynomials for all sets of orthogonal
polynomials over a finite interval, say [a, b]. In this particular case the inequality
is given by

(31) $$\Bigl|\int_a^x \bigl(p_{2n+i}(t)/L_{2n+i}\bigr) f(t)\,dt\Bigr| \le \frac{1}{2} \cdot \frac{1}{\sum_{r=0}^{n} \omega_r^2(x)}.$$

Applying inequality (27) to other pairs of the class $F_{2n}(x, \epsilon)$ one can obtain
other general inequalities among orthogonal polynomials. Of course, because
of the generality of the class $F_{2n}(x, \epsilon)$, these may be expected to be rather crude,
in the sense that the right-hand side of (31) is in special cases (e.g. Legendre
polynomials) much larger than the left-hand side in (31) (cf. [9], chapter 7,
p. 154).

In the particular case when $p_r(x)$ is the Legendre polynomial of order r
defined over the interval [-1, 1] by

$$\int_{-1}^{1} p_r(x)\,p_s(x)\,dx = \begin{cases} 2/(2r + 1) & \text{if } r = s, \\ 0 & \text{if } r \ne s, \end{cases}$$

inequality (31) becomes

$$|p_{2n+2}(x) - p_{2n}(x)| \le (4n + 3) \Big/ \sum_{r=0}^{n} (2r + 1)\,p_r^2(x),$$

which is not as strong an inequality as those known in the case of Legendre
polynomials.
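
The Legendre form of the inequality lends itself to a direct numerical check. The following is a hedged sketch: it evaluates both sides on a point x of [-1, 1] using Bonnet's recurrence for the Legendre polynomials; the function names `legendre_values` and `inequality_sides` are hypothetical and are not taken from the paper.

```python
def legendre_values(x: float, top: int) -> list:
    """Return [P_0(x), ..., P_top(x)] for top >= 1 by Bonnet's recurrence."""
    vals = [1.0, x]
    for k in range(1, top):
        vals.append(((2 * k + 1) * x * vals[k] - k * vals[k - 1]) / (k + 1))
    return vals[: top + 1]


def inequality_sides(x: float, n: int):
    """Return (left side, right side) of the Legendre inequality at x."""
    p = legendre_values(x, 2 * n + 2)
    lhs = abs(p[2 * n + 2] - p[2 * n])
    rhs = (4 * n + 3) / sum((2 * r + 1) * p[r] ** 2 for r in range(n + 1))
    return lhs, rhs
```

Evaluating `inequality_sides` over a grid of x for several n shows how much slack the bound leaves, in line with the remark that it is not a strong inequality.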

Theorem 5 is particularly of interest in respect of the prevailing practice of
fitting a Pearsonian cdf to an unknown cdf when the range of the Pearsonian
cdf is finite. If the fitted cdf is denoted by F(x) and if say the first four
moments have been fitted, then any member of the class $F_4(x, \epsilon)$, with $|\epsilon| < 1$,
i = 1, 2, ···, has the same first four moments as the unknown cdf. There
is no indication, however, that the fitted Pearson cdf gives a better
approximation than other members of the class $F_4(x, \epsilon)$. In other words, for a finite
range, Theorem 5 leads to a method of fitting which is more general than that
provided by the usual methods. It may be possible in particular cases to choose
a value of $\epsilon$ which gives a better fit than F(x).

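The class $F_n(x, \epsilon)$ of Theorem 5 may be extended further into the class
$F_n(x, \epsilon_1, \epsilon_2, \cdots, \epsilon_m)$ with $|\epsilon_1| + |\epsilon_2| + \cdots + |\epsilon_m| \le 1$ where

$$F_n(x, \epsilon_1, \epsilon_2, \cdots, \epsilon_m) = \int_a^x \Bigl(1 + \sum_{i=1}^{m} \epsilon_i\, p_{n+i}(t)/L_{n+i}\Bigr) f(t)\,dt$$

provided the required number of moments, necessary for the existence of
$p_{n+m}(x)$, exist.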

Finally, Theorem 5 proves the existence of an infinite class of continuous
cumulative distribution functions which are solutions of a given reduced
moment problem over a finite range provided that there is at least one continuous
cdf which is a solution to the given moment problem.


REFERENCES
[1] P. L. Chebycheff, "Sur les valeurs limites des intégrales," Journal de Mathématiques,
(2), Vol. 19 (1874).
[2] M. G. Kendall, The Advanced Theory of Statistics, Vol. 1, 2nd ed., revised, Charles
Griffin & Co., Ltd., London, 1945.
[3] A. Markoff, "Démonstration de certaines inégalités de M. Chébycheff," Math.
Ann., Vol. 24 (1884), pp. 172-180.
[4] J. A. Shohat and J. D. Tamarkin, The Problem of Moments, Mathematical Surveys,
No. 1, American Mathematical Society, New York, 1943, reprinted 1950.
[5] T. J. Stieltjes, "Quelques recherches sur la théorie des quadratures dites
mécaniques," Annales Scientifiques de l'École Normale Supérieure, (3), Vol. 1
(1884).
[6] T. J. Stieltjes, "Recherches sur les fractions continues," Annales de la Faculté des
Sciences de Toulouse, Vol. 8 (1894).
[7] T. J. Stieltjes, "Recherches sur les fractions continues," Annales de la Faculté des
Sciences de Toulouse, Vol. 9 (1895).
[8] T. J. Stieltjes, Oeuvres Complètes de Thomas Jan Stieltjes, Vol. 2, published by les
Soins de la Société Mathématique d'Amsterdam, Groningen, 1918.
[9] G. Szegö, Orthogonal Polynomials, American Mathematical Society Colloquium
Publications, Vol. 23, 1939.
[10] J. V. Uspensky, Introduction to Mathematical Probability, McGraw-Hill Book Co.,
1937.





SPACING OF INFORMATION IN POLYNOMIAL REGRESSION¹

By A. DE LA GARZA
Carbide and Carbon Chemicals Company
Oak Ridge, Tennessee

1. Summary and Introduction. The purpose of this paper is to investigate a 
problem in the spacing of information in certain applications of polynomial 
regression. It is shown that for a polynomial of degree m, the variance-cova- 
riance matrix of the estimated polynomial coefficients given by a spacing of 
information at more than m + 1 values of the sure variate can always be 
attained by spacing the same information at only m + 1 values of the sure 
variate, these spacing values being bounded by the first spacing values. The 
presented results are of use in experimental design involving polynomial re- 
gression when a choice of sure variate values is possible but restricted to a 
specified range. 

Let the polynomial under consideration be

(1.1) $$P(x) = a_1 + a_2 x + \cdots + a_{m+1} x^m, \qquad m \ge 1,$$

and let

$$P(x_\epsilon) = y(x_\epsilon) + \delta_\epsilon, \qquad \epsilon = 1(1)N, \qquad N \ge (m + 1).$$

The $y(x_\epsilon)$ are observed uncorrelated variates with random error $\delta_\epsilon$ having
mean zero and finite variance $\sigma_\epsilon^2 > 0$. The $x_\epsilon$ are observed variates without
error, there being at least (m + 1) distinct $x_\epsilon$.

The following notation is introduced. Let $\tilde{x} = (1, x, x^2, \cdots, x^m)$, $X = (\tilde{x}_\epsilon)$,
$\epsilon = 1(1)N$, and let W be the N × N diagonal matrix with entry $w_\epsilon = 1/\sigma_\epsilon^2$ in
the $(\epsilon, \epsilon)$ position. $w_\epsilon$ will henceforth be referred to as the "information" of
$y(x_\epsilon)$, and $Q = \sum w_\epsilon$, $\epsilon = 1(1)N$, will be referred to as the "total information."
The matrix X'WX will be called the "information matrix."

The problem is to show that given a spacing of total information Q at
locations $x_\epsilon$, $\epsilon = 1(1)N$, $N \ge (m + 1)$, there being at least (m + 1) distinct $x_\epsilon$, it
is always possible to re-space Q at (m + 1) distinct locations $r_j$, j = 1(1)(m + 1),
in such a manner that $\min x_\epsilon \le r_j \le \max x_\epsilon$, $\epsilon = 1(1)N$, j = 1(1)(m + 1),
and X'WX = R'UR, with R'UR being the information matrix of the re-spacing.
The problem is solved by prescribing a method for finding the required U and R
which determine the spacing of the total information.

The motivation for the problem is as follows. In experimentation in the
chemical engineering industry, we most often have control over our sure variates. The
sure variate x could be the pressure level of our process equipment, and we would
be permitted to choose any operating pressure x in the pressure range min x to
max x, tolerated by our equipment. Quite often, and in particular with isotopic
measurements, laboratory analytical determinations are required for our y-


Received 1/7/53, revised 8/12/53. 
¹ Work performed under AEC Contract No. W-7405-eng-26.


variates with the laboratory being the major source of error. With each
laboratory determination having variance $\sigma^2$, we can request $n_x$ determinations on the
material sample taken at sure variate x. Using the average of the laboratory
determinations, the corresponding y variate has variance $\sigma^2/n_x$. Specifying Q then
amounts to specifying total laboratory effort expended on the experiment. It
might be set by such usual factors as the dollar allowance on the experiment; if
the material is highly radioactive, it might be set by such unusual factors as
exposure time allowed the laboratory analysts. Furthermore, in experimentation
with fairly large equipment, it is important to minimize the distinct levels
of operation, that is, the distinct number of sure x's. The time required to make
the change and to reach sufficient equilibrium representing steady-state
operation of the process is often long. In any case we lose time, and with production
line equipment, we also lose production. These are the reasons for minimizing
the distinct number of sure x's in the experiment. The equivalence X'WX =
R'UR gives the required minimization. If the functional relationship between
y and x is adequately represented by a polynomial of degree m, the equivalence
assures that only (m + 1) distinct sure x's are required to maintain the same
efficiency of statistical evaluation of the experimental results, since most
statistical evaluation will require $(X'WX)^{-1}$, which can now be replaced by $(R'UR)^{-1}$.

It may be seen that such experiments, common in physico-chemical industry, 
present a formulation and require a mathematical model not found in ordinary 
regression theory, where usually it is not possible to assign various values to the 
corresponding y variances. 

With the indicated background in mind, the results of this paper find
application in experimental design. The determination of a spacing which optimizes
some criteria involving the information matrix is made simpler. A familiar
example arising in point estimation is minimizing $p(X'WX)^{-1}p'$ for a specified
row vector p. An example from interpolation is minimizing the maximum of
$\tilde{\xi}(X'WX)^{-1}\tilde{\xi}'$ with $\tilde{\xi} = (1, \xi, \xi^2, \cdots, \xi^m)$ and $\min x_\epsilon \le \xi \le \max x_\epsilon$; the
extrapolation problem is similar.

The advantage of applying the above result to such problems is that the spac- 
ing of information is at once reduced to (m + 1) distinct locations, any larger 
number being unnecessary. The matrix X then is the matrix of a Vandermonde 
determinant. The properties of these matrices are well known and attractive. 
These uses will be illustrated by an example given in Section 4. 


2. Some useful relations. Prior to investigating the problem as outlined above,
several relations needed later will be developed. First, a convention in
subscripts: subsequently, small italic letters will run from 1 to (m + 1), and small
Greek letters will run from 1 to N. Capital italic letters will run as indicated.

With the notation of Section 1 and with $\alpha = (a_1\ a_2\ \cdots\ a_{m+1})'$, the polynomial
(1.1) under consideration is

(2.1) $$P(x) = \tilde{x}\alpha.$$







Choose (m + 1) distinct numbers $z_j$. From Lagrange interpolation it follows
that

$$P(x) = \frac{(x - z_2)(x - z_3)\cdots(x - z_{m+1})}{(z_1 - z_2)(z_1 - z_3)\cdots(z_1 - z_{m+1})}\,P(z_1) + \frac{(x - z_1)(x - z_3)\cdots(x - z_{m+1})}{(z_2 - z_1)(z_2 - z_3)\cdots(z_2 - z_{m+1})}\,P(z_2) + \cdots + \frac{(x - z_1)(x - z_2)\cdots(x - z_m)}{(z_{m+1} - z_1)(z_{m+1} - z_2)\cdots(z_{m+1} - z_m)}\,P(z_{m+1}).$$

With an obvious notation,
(2.2) $$P(x) = \sum_j F(x, z_j)P(z_j).$$


Consider now that from (2.1) $P(z) = Z\alpha$, where $P(z) = (P(z_1)\ P(z_2)\ \cdots\ P(z_{m+1}))'$,
$\tilde{z} = (1\ z\ \cdots\ z^m)$, and $Z = (\tilde{z}_j)$. The matrix Z is nonsingular, since its determinant
is a Vandermonde determinant not equal to zero due to the $z_j$ being distinct.
Thus, $Z^{-1}P(z) = \alpha$, and for any x,

(2.3) $$\tilde{x}Z^{-1}P(z) = \tilde{x}\alpha.$$

Since from (2.2)

(2.4) $$P(x) = F(x, z)P(z),$$

with $F(x, z) = (F(x, z_1)\ F(x, z_2)\ \cdots\ F(x, z_{m+1}))$, it follows from (2.1), (2.3), and
(2.4) that

(2.5) $$\tilde{x}Z^{-1}P(z) = F(x, z)P(z).$$

Equality of (2.5) for any z implies

(2.6) $$\tilde{x}Z^{-1} = F(x, z).$$
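
Relation (2.6) can be checked exactly for particular numbers. The sketch below is an illustration under stated assumptions: rather than inverting Z, it verifies the equivalent statement that the row of Lagrange coefficients F(x, z), multiplied by Z, returns the row of powers (1, x, ..., x^m). The function names `lagrange_row` and `powers_via_lagrange` are hypothetical.

```python
from fractions import Fraction


def lagrange_row(x, z):
    """Return the row F(x, z) of Lagrange coefficients at the distinct points z."""
    row = []
    for j, zj in enumerate(z):
        term = Fraction(1)
        for p, zp in enumerate(z):
            if p != j:
                term *= Fraction(x - zp) / Fraction(zj - zp)
        row.append(term)
    return row


def powers_via_lagrange(x, z):
    """Return F(x, z) Z, where row j of Z is (1, z_j, ..., z_j**m)."""
    coeffs = lagrange_row(x, z)
    m = len(z) - 1
    return [sum(c * zj ** k for c, zj in zip(coeffs, z)) for k in range(m + 1)]
```

If (2.6) holds, `powers_via_lagrange(x, z)` must equal the list of powers of x up to m for any x and any distinct z.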


3. Investigation of the problem. The problem as stated in Section 1 is now
investigated. From the results of Section 2, it may be shown that without loss
in generality the range of the variable x may be limited to $\min x_\epsilon = -1$ and
$\max x_\epsilon = 1$. Another simplification follows. Suppose that some of the $x_\epsilon$ are not
distinct. Say that $x_1 = x_2 = \cdots = x_K$ with corresponding information $w_1, w_2,
\cdots, w_K$. It may be verified directly that the information matrix for $w_1,
w_2, \cdots, w_K, w_{K+1}, \cdots, w_N$ at $x_1, x_2, \cdots, x_K, x_{K+1}, \cdots, x_N$ is the same as the
information matrix for $(w_1 + w_2 + \cdots + w_K), w_{K+1}, \cdots, w_N$ at $x_K, x_{K+1},
\cdots, x_N$. Such a grouping can be made for all $x_\epsilon$ not distinct, thereby reducing
the problem to considering only distinct $x_\epsilon$. Finally, for N = (m + 1), there is
no problem since the information already is at (m + 1) locations.
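
The grouping of repeated locations is easy to confirm by direct computation. The following is a minimal sketch, assuming only the definition of the information matrix X'WX from Section 1; the function name `information_matrix` is hypothetical.

```python
def information_matrix(xs, ws, m):
    """Return X'WX for degree m: entry (i, j) is the sum of w * x**(i + j)."""
    return [[sum(w * x ** (i + j) for x, w in zip(xs, ws))
             for j in range(m + 1)]
            for i in range(m + 1)]
```

For instance, information 2 and 3 placed at the same location gives the same matrix as information 5 placed there once, since each entry depends on the data only through sums of the form w times a power of x.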

The problem may now be re-stated as follows. It must be shown that given a
spacing of total information Q at N distinct locations $x_\epsilon$, $\epsilon = 1(1)N$, N >
(m + 1), with $\min x_\epsilon = -1$, and $\max x_\epsilon = 1$, it is always possible to re-space
Q at (m + 1) distinct locations $r_j$, j = 1(1)(m + 1), in such a manner that
$-1 \le r_j \le 1$, and X'WX = R'UR.

Suppose that $R^{-1}$ exists. Then,

(3.1) $$(XR^{-1})'W(XR^{-1}) = U.$$







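Reference to (2.6) shows that the off-diagonal elements of $(XR^{-1})'W(XR^{-1})$ are
proportional to

(3.2) $$c_{gh} = \sum_\epsilon w_\epsilon \phi_\epsilon \prod_p (x_\epsilon - r_p), \qquad g \ne h,\ p \ne g,\ p \ne h,$$

with $\phi_\epsilon = \prod_j (x_\epsilon - r_j)$. Since in (3.1), U is diagonal, it is required that $c_{gh}$ be
zero for $g \ne h$. This requirement is satisfied if the $r_j$ are determined such that

(3.3) $$\sum_\epsilon w_\epsilon \phi_\epsilon = 0, \quad \sum_\epsilon w_\epsilon \phi_\epsilon x_\epsilon = 0, \quad \cdots, \quad \sum_\epsilon w_\epsilon \phi_\epsilon x_\epsilon^{m-1} = 0.$$

For reasons that will be discussed later, further constrain the $r_j$ by

(3.4) $$\sum_\epsilon w_\epsilon \phi_\epsilon x_\epsilon^{m} = 0.$$

By direct expansion,

(3.5) $$\phi_\epsilon = \beta_1 + \beta_2 x_\epsilon + \cdots + \beta_{m+1} x_\epsilon^{m} + x_\epsilon^{m+1}.$$

Hence, the $r_j$ are the (m + 1) roots of the polynomial

(3.6) $$\phi(r) = \beta_1 + \beta_2 r + \cdots + \beta_{m+1} r^{m} + r^{m+1}.$$

Substituting (3.5) in (3.3) and (3.4), there results

(3.7) $$\begin{pmatrix} f_0 & f_1 & \cdots & f_m \\ f_1 & f_2 & \cdots & f_{m+1} \\ \vdots & \vdots & & \vdots \\ f_{m-1} & f_m & \cdots & f_{2m-1} \\ f_m & f_{m+1} & \cdots & f_{2m} \end{pmatrix} \begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_m \\ \beta_{m+1} \end{pmatrix} = - \begin{pmatrix} f_{m+1} \\ f_{m+2} \\ \vdots \\ f_{2m} \\ f_{2m+1} \end{pmatrix}$$

with

$$f_L = \sum_\epsilon w_\epsilon x_\epsilon^{L}, \qquad L = 0(1)(2m + 1).$$

(3.7) is a linear system of (m + 1) equations in (m + 1) unknowns. The square
matrix is X'WX, which is nonsingular, and hence (3.7) has a unique solution,
$\beta_j$. This solution is not trivial, since it is readily seen that some $f_L$, $(m + 1) \le
L \le (2m + 1)$, is not zero. The corresponding $r_j$ are then given by (3.6).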

Thus, a method of determining the $r_j$ that satisfy (3.3) and (3.4) has been
prescribed. Accordingly, these $r_j$ make $c_{gh}$ zero in (3.2) as required. It will now be
shown that they are real and distinct.

That the $r_j$ are real is shown as follows. Since they are the roots of the
polynomial (3.6), complex $r_j$, if any, must occur in conjugate pairs. Say that $r_1 =
b_1 + ib_2$ and $r_2 = b_1 - ib_2$, with $b_2 \ne 0$, the nature of the remaining roots being
unspecified. Since in (3.2), $c_{12}$ is then zero, it follows that

(3.8) $$\sum_\epsilon w_\epsilon \bigl[(x_\epsilon - b_1)^2 + b_2^2\bigr](x_\epsilon - r_3)^2 \cdots (x_\epsilon - r_m)^2 (x_\epsilon - r_{m+1})^2 = 0.$$

Note that all factors in each term of (3.8) are nonnegative, and therefore,
equality to zero implies that all terms must be zero. But $[(x_\epsilon - b_1)^2 + b_2^2]$ with
$b_2 \ne 0$ never vanishes, and $(x_\epsilon - r_3)^2 \cdots (x_\epsilon - r_m)^2 (x_\epsilon - r_{m+1})^2$ can vanish for
at most (m - 1) distinct $x_\epsilon$. Since there are at least (m + 2) distinct $x_\epsilon$, it
follows that (3.8) cannot be zero, and hence $r_1$ and $r_2$ are not complex. The argument







is the same for any other pair of roots, and hence, all the $r_j$ are real. That they
are distinct is shown similarly. Suppose $r_1 = r_2 = b_1$, and again form $c_{12}$, which
is now given by (3.8) with $b_2 = 0$. The arguments are as before. The terms can
now vanish for at most m distinct $x_\epsilon$, but since there are at least (m + 2)
distinct $x_\epsilon$, (3.8) with $b_2 = 0$ cannot be zero. Hence, all the $r_j$ are distinct.

Since the $r_j$ are distinct, it follows that the matrix R is nonsingular, and that
$c_{gh}$ being zero in (3.2) implies that $(XR^{-1})'W(XR^{-1}) = U$ is a diagonal matrix.
Since both X'WX and R are nonsingular, it follows that no diagonal element
of U is zero. Further reference to (2.6) shows that the diagonal elements of U are

(3.9) $$u_j = \sum_\epsilon w_\epsilon F^2(x_\epsilon, r_j),$$

and hence, all $u_j$ are positive.

Thus, with U given by (3.9), X'WX = R'UR. Since the (1,1) element of
X'WX is the total information $Q = \sum w_\epsilon$, and since the (1,1) element of
R'UR is $\sum u_j$, it follows that $Q = \sum u_j$. Inasmuch as it has been shown that
$u_j > 0$, the $u_j$ may be considered a re-spacing of the total information Q at
locations $r_j$.

To complete the solution of the problem, it remains to show that $-1 \le r_j \le 1$.
Suppose that the $r_j$ are such that two or more of the $r_j$ are not in the closed
interval [-1, 1]. Say that $r_1$ and $r_2$ are not in this interval. Since $c_{12}$ in (3.2) must
be zero, it follows that

(3.10) $$\sum_\epsilon w_\epsilon (x_\epsilon - r_1)(x_\epsilon - r_2)(x_\epsilon - r_3)^2 \cdots (x_\epsilon - r_m)^2 (x_\epsilon - r_{m+1})^2 = 0.$$

Consider that $(x_\epsilon - r_1)(x_\epsilon - r_2)$ never equals zero and always must have the
same algebraic sign. Furthermore

(3.11) $$(x_\epsilon - r_3)^2 \cdots (x_\epsilon - r_m)^2 (x_\epsilon - r_{m+1})^2 \ge 0.$$

Hence equality to zero in (3.10) implies that all terms must be zero. But
$(x_\epsilon - r_1)(x_\epsilon - r_2)$
can never vanish, and (3.11) can vanish for at most (m - 1) distinct $x_\epsilon$. Since
there are at least (m + 2) distinct $x_\epsilon$, all terms in (3.10) cannot vanish. Thus,
two or more of the $r_j$ cannot be excluded from the closed interval [-1, 1], and
hence, it has been shown that m of the $r_j$ are in the closed interval [-1, 1].


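Consider now the polynomial in (3.6) whose roots are the $r_j$. In determinant
form this polynomial may be shown to be

(3.12) $$\phi(r) = \frac{(-1)^{m+1}}{\Delta} \begin{vmatrix} 1 & r & \cdots & r^{m+1} \\ f_0 & f_1 & \cdots & f_{m+1} \\ \vdots & \vdots & & \vdots \\ f_m & f_{m+1} & \cdots & f_{2m+1} \end{vmatrix}$$

where $\Delta = |X'WX|$.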







Evaluate $\phi(r)$ at r = -1 and r = 1. It will be seen that, for J = -1, 1,

(3.13) $$\phi(J) = \frac{J^{m+1}}{\Delta} \begin{vmatrix} f_0 - Jf_1 & f_1 - Jf_2 & \cdots & f_m - Jf_{m+1} \\ f_1 - Jf_2 & f_2 - Jf_3 & \cdots & f_{m+1} - Jf_{m+2} \\ \vdots & \vdots & & \vdots \\ f_m - Jf_{m+1} & f_{m+1} - Jf_{m+2} & \cdots & f_{2m} - Jf_{2m+1} \end{vmatrix}$$

Remembering the definition of $f_L$, the elements of the indicated determinant
$|H_J|$ in (3.13) are of the form $\sum_\epsilon w_\epsilon (1 - Jx_\epsilon) x_\epsilon^{L}$, L = 0(1)2m, which shows
that

(3.14) $$H_J = X'W_JX,$$

where $W_J$ is the N × N diagonal matrix with diagonal elements $w_\epsilon(1 - Jx_\epsilon)$.
Since $\min x_\epsilon = -1$ and $\max x_\epsilon = 1$, it follows that $W_J$ always has one diagonal
element equal to zero, all others being positive. Hence, from (3.14) it follows
that $H_J = X_J'V_JX_J$, where $V_J$ is the (N - 1) × (N - 1) diagonal matrix
formed from $W_J$ by striking out the row and column corresponding to $\min x_\epsilon$
for J = -1 and $\max x_\epsilon$ for J = 1; and $X_J$ is the (N - 1) × (m + 1) matrix
formed from X by striking out the row corresponding to $\min x_\epsilon$ for J = -1
and $\max x_\epsilon$ for J = 1. Since there are at least (m + 2) distinct $x_\epsilon$, $X_J$ has rank
(m + 1). Also, since $V_J$ is diagonal with nonzero diagonal elements, $V_J$ is
nonsingular. It follows that the characteristic numbers of $H_J$ are all positive and
hence, $|H_J| > 0$. Then, since $\Delta > 0$, the following conclusions can now be made
concerning $\phi(J)$ in (3.13):

(3.15) $$\begin{aligned} &\phi(1) > 0 \text{ and } \phi(\infty) > 0, \\ &\phi(-1) > 0 \text{ and } \phi(-\infty) > 0 \text{ for odd } m, \\ &\phi(-1) < 0 \text{ and } \phi(-\infty) < 0 \text{ for even } m. \end{aligned}$$

It was previously shown that m of the roots $r_j$ are in the closed interval
[-1, 1]. (3.15) shows that -1 and 1 cannot be roots, and hence, it may be stated
that m of the roots are in the open interval (-1, 1). Furthermore, knowing the
sign of $\phi(r)$ for r = -1, 1 and for sufficiently large values of |r|, it may be
reasoned that all $r_j$ are in the open interval (-1, 1), for one exterior root would
imply another.

In conclusion, it has been shown that for N > (m + 1), $-1 < r_j < 1$. As
indicated in the preliminary discussion, for N = (m + 1), $-1 \le r_j \le 1$. Hence,
$-1 \le r_j \le 1$ holds for all cases, and the solution of the problem is complete.

In summary, the $r_j$ are the roots of the polynomial (3.6); the polynomial
coefficients $\beta_j$ satisfy the linear system (3.7). The information located at each $r_j$
is given by (3.9).
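
The prescription just summarized can be sketched numerically. The following is a minimal illustration, not code from the paper: the function names `solve_linear`, `monic_roots`, and `respace` are hypothetical, the linear system (3.7) is solved by ordinary Gaussian elimination, and the roots of (3.6) are found by the Durand-Kerner iteration, which is swapped in here only as a convenient root finder without a general convergence guarantee. Since the roots are known from the argument above to be real, only their real parts are kept.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for r in range(c + 1, n):
            factor = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= factor * aug[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def monic_roots(coeffs, iterations=500):
    """Roots of a monic polynomial (coefficients lowest power first)."""
    degree = len(coeffs) - 1
    roots = [(0.4 + 0.9j) ** k for k in range(degree)]  # usual starting values
    for _ in range(iterations):
        for i in range(degree):
            value = 0j
            for c in reversed(coeffs):  # Horner evaluation at roots[i]
                value = value * roots[i] + c
            denom = 1 + 0j
            for j in range(degree):
                if j != i:
                    denom *= roots[i] - roots[j]
            roots[i] -= value / denom
    return roots


def respace(xs, ws, m):
    """Return locations r_j and information u_j such that R'UR = X'WX."""
    f = [sum(w * x ** L for x, w in zip(xs, ws)) for L in range(2 * m + 2)]
    matrix = [[f[i + j] for j in range(m + 1)] for i in range(m + 1)]
    rhs = [-f[m + 1 + i] for i in range(m + 1)]
    beta = solve_linear(matrix, rhs)                        # system (3.7)
    r = sorted(z.real for z in monic_roots(beta + [1.0]))   # roots of (3.6)
    u = []
    for j, rj in enumerate(r):                              # information (3.9)
        total = 0.0
        for x, w in zip(xs, ws):
            lagrange = 1.0
            for p, rp in enumerate(r):
                if p != j:
                    lagrange *= (x - rp) / (rj - rp)
            total += w * lagrange ** 2
        u.append(total)
    return r, u
```

As a usage example, for m = 1 with unit information at x = -1, 0, 1, the system (3.7) gives a quadratic whose roots are the two square roots of 2/3, each carrying information 3/2, so the total information 3 is preserved.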

Two closing remarks are in order. Returning to (3.2), it may be noticed that
the constraints (3.3) are sufficient to make U a diagonal matrix. The added
constraint (3.4) is sufficient to locate all $r_j$ in the closed interval [-1, 1]. To see
this, suppose (3.4) is not imposed. This is equivalent to striking out the last
row of X'WX in (3.7), leaving a linear system of m equations in (m + 1)
unknowns. The rank of this system is m, and hence, $\beta_{m+1}$ can be chosen at will.
Now, U being diagonal, that is, (3.2) vanishing, demands that m of the $r_j$ be
in the closed interval [-1, 1]. Accordingly, since $\beta_{m+1} = -\sum r_j$, by choosing
$|\beta_{m+1}|$ sufficiently large, a root can always be made exterior to [-1, 1].

It is of further interest to note that the results of this paper suggest an
optimum spacing characteristic; namely: $\max r_j - \min r_j \le \max x_\epsilon - \min x_\epsilon$,
the equality being necessary only for the trivial case N = (m + 1). Thus,
without triviality, the same information matrix can be attained by a lesser number of
locations in a shorter interval.


4. An application. The application of the results of this paper to the problems
listed in Section 1 will be indicated by considering an interval interpolation
problem for the quadratic.

Let m = 2, and permit N independent observations $y(x_\epsilon)$, $\epsilon = 1(1)N$, of equal
variance $\sigma^2$ to be taken in the specified interval $x_L \le x_\epsilon \le x_H$. Let $Y(\xi)$ be the
least squares estimator of $P(\xi)$. The problem is to find the spacing of the N
observations that will minimize the maximum variance of $Y(\xi)$ for $x_L \le \xi \le x_H$.

The variance of $Y(\xi)$ is $\sigma^2 Y(\xi) = \tilde{\xi}(X'WX)^{-1}\tilde{\xi}'$. Whatever the optimum
spacing be, it will give rise to some matrix X'WX; let this be $(X'WX)_0$. From the
given results, it follows that there exists a matrix $R'UR = (X'WX)_0$. Since
m = 2,

$$R = \begin{pmatrix} \tilde{r}_1 \\ \tilde{r}_2 \\ \tilde{r}_3 \end{pmatrix} \quad\text{and}\quad U = \frac{1}{\sigma^2} \begin{pmatrix} n_1 & 0 & 0 \\ 0 & n_2 & 0 \\ 0 & 0 & n_3 \end{pmatrix},$$

implying $n_j$ observations at $r_j$ satisfying $x_L \le r_j \le x_H$, and $\sum n_j = N$, j =
1(1)3. Hence, three locations suffice to establish the desired optimum spacing.

Let $\bar{y}_j$ be the average of the $n_j$ observations at $r_j$; let $\eta$ be the column vector
$(\bar{y}_1\ \bar{y}_2\ \bar{y}_3)'$; and let a be the column satisfying the normal equations R'URa =
R'U$\eta$. Since R and U are nonsingular, Ra = $\eta$. Since Ra is the column vector
$(Y(r_1)\ Y(r_2)\ Y(r_3))'$, $Y(\xi)$ passes through $\bar{y}_j$ at $\xi = r_j$. Hence, $Y(\xi)$ may be written
in the Lagrange form of Section 2,

(4.1) $$Y(\xi) = \frac{(\xi - r_2)(\xi - r_3)}{(r_1 - r_2)(r_1 - r_3)}\,\bar{y}_1 + \frac{(\xi - r_1)(\xi - r_3)}{(r_2 - r_1)(r_2 - r_3)}\,\bar{y}_2 + \frac{(\xi - r_1)(\xi - r_2)}{(r_3 - r_1)(r_3 - r_2)}\,\bar{y}_3.$$

It follows that $\sigma^2 Y(r_j) = \sigma^2/n_j$. For any such spacing, let $\sigma^2_{\max}$ be the maximum
variance of $Y(\xi)$ in the interval. Then,

$$\sigma^2/n_j \le \sigma^2_{\max},$$

and thus,

(4.2) $$\frac{\sigma^2}{3} \sum_{j=1}^{3} \frac{1}{n_j} \le \sigma^2_{\max}.$$

The minimum value of $\sum_{j=1}^{3} 1/n_j$ constrained by $\sum_{j=1}^{3} n_j = N$ is found to be
9/N with $n_j = N/3$. Hence, from (4.2),

(4.3) $$3\sigma^2/N \le \min \sigma^2_{\max}.$$


Note from (4.1) that $\sigma^2 Y(\xi)$ increases as $\xi$ departs from the smallest and the
largest $r_j$ in direction of leaving the interval $x_L$, $x_H$. Hence, locate N/3
observations at each of $x_L$ and $x_H$. Note that $\sigma^2 Y(\xi)$ has one differentiable maximum occurring
in the interior of the interval. Locate N/3 observations at $(x_L + x_H)/2$. From
symmetry, the differentiable maximum then occurs at $(x_L + x_H)/2$. Hence, the
maximum $\sigma^2 Y(\xi)$ for $x_L \le \xi \le x_H$ for the spacing $r_1 = x_L$, $r_2 = (x_L + x_H)/2$,
$r_3 = x_H$, $n_j = N/3$ is $3\sigma^2/N$. The inequality (4.3) assures that this particular
spacing gives the desired minimum maximum variance, and that this variance
is $3\sigma^2/N$, for the equality has been produced.
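
The value of the maximum variance for a given three-point spacing can be checked on a grid. The sketch below is an illustration under stated assumptions: it uses the Lagrange form (4.1) with independent averages of variance sigma squared over n_j, so that the variance at a point is sigma squared times the sum of the squared Lagrange coefficients divided by n_j. The function names `variance_of_fit` and `max_variance`, and the grid size, are hypothetical.

```python
def variance_of_fit(xi, r, n, sigma2=1.0):
    """Variance of Y(xi) for observations n[j] at r[j], from the form (4.1)."""
    total = 0.0
    for j, rj in enumerate(r):
        coefficient = 1.0
        for p, rp in enumerate(r):
            if p != j:
                coefficient *= (xi - rp) / (rj - rp)
        total += coefficient ** 2 / n[j]
    return sigma2 * total


def max_variance(r, n, lo, hi, sigma2=1.0, grid=2001):
    """Largest variance of Y(xi) over an equally spaced grid on [lo, hi]."""
    return max(variance_of_fit(lo + (hi - lo) * i / (grid - 1), r, n, sigma2)
               for i in range(grid))
```

With N = 30 observations on the interval from 0 to 2, the equal allocation at the two ends and the midpoint gives a maximum of 3/30, whereas an unequal allocation such as 12, 6, 12 at the same locations gives a larger maximum, attained at the midpoint.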

This conclusion is directly applicable for N divisible by 3. For small N not
divisible by 3, a fine structure study, using $3\sigma^2/N$ as basis for comparison, will
indicate an acceptable spacing with little increase in variance.





GENERALIZATION OF THE THEOREM OF GLIVENKO-CANTELLI

By J. WOLFOWITZ
Cornell University and University of California at Los Angeles


Let $X_1, X_2, \cdots$ be independent chance variables with the same distribution
function F(x). (F(x) is the probability that $X_1 < x$.) The "empiric" distribution
function $F_n^*(x)$ of $X_1, \cdots, X_n$ is given by

(1) $$F_n^*(x) = \frac{1}{n} \sum_{i=1}^{n} \psi_x(X_i),$$

where

$$\psi_x(a) = \begin{cases} 1 & \text{if } a < x, \\ 0 & \text{if } a \ge x. \end{cases}$$

Thus $F_n^*(x)$ is 1/n times the number of $X_1, \cdots, X_n$ which are less than x. We
define the distance $\delta(G_1, G_2)$ between the two distribution functions $G_1$ and $G_2$ as

(2) $$\delta(G_1, G_2) = \sup_x |G_1(x) - G_2(x)|.$$

Let P{ } denote the probability of the relation in braces. The theorem of
Glivenko-Cantelli (see, for example [1], page 260) states that

(3) $$P\{\lim_{n \to \infty} \delta(F(x), F_n^*(x)) = 0\} = 1.$$
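
The two definitions above translate directly into a short computation. The following is a minimal sketch: `empiric_cdf` counts the observations strictly less than x, as in (1), and `sup_distance` evaluates the distance (2) between a given distribution function and the empiric one. The second function assumes the given distribution function is continuous, in which case the supremum is approached at or just to the right of an observed value; both function names are hypothetical.

```python
def empiric_cdf(sample, x):
    """Fraction of the observations that are strictly less than x, as in (1)."""
    return sum(1 for value in sample if value < x) / len(sample)


def sup_distance(cdf, sample):
    """Distance (2) between a continuous cdf and the empiric cdf of the sample."""
    ordered = sorted(sample)
    n = len(ordered)
    # At the i-th smallest value the empiric cdf jumps from i/n to (i + 1)/n
    # (i counted from 0), so both levels are compared with cdf(value).
    return max(max(abs(cdf(value) - i / n), abs(cdf(value) - (i + 1) / n))
               for i, value in enumerate(ordered))
```

For the three observations 0.1, 0.4, 0.8 and the uniform distribution function on the unit interval, the distance is 4/15, attained just to the right of 0.4.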


Let $Y = X_1^1, \cdots, X_1^k, X_2^1, \cdots, X_2^k, \cdots$, ad inf. be a sequence of independent
chance variables such that $X_1^i, X_2^i, \cdots$, ad inf. have the same distribution
function (say $F_i(x)$), i = 1, ···, k. Let $q_i$, i = 1, ···, k, be real parameters.
We shall prove the following generalization of the theorem of Glivenko-Cantelli.

THEOREM. Let $q = (q_1, \cdots, q_k)$. Let $F(x \mid q)$ be the distribution function of
$\sum_{i=1}^{k} q_i X_1^i$. Let $F_n^*(x \mid q)$ be the empiric distribution function of

$$\Bigl(\sum_{i=1}^{k} q_i X_j^i\Bigr), \qquad j = 1, \cdots, n.$$

Then

(4) $$P\{\lim_{n \to \infty} \sup_q \delta(F(x \mid q), F_n^*(x \mid q)) = 0\} = 1.$$


This stronger version of the Glivenko-Cantelli theorem will prove useful in 
mathematical statistics for the purpose of estimating unknown distribution 
functions. We have already made use of essentially our result in [2], [3], and [4]. 

For typographical simplicity we shall carry through the proof for k = 2, and 
leave to the reader the easy verification of the fact that the method is valid for 


Received 9/9/53. 







all k. It is easy to see that, when k = 2, we may, without loss of generality,
take $q_2 = 1$. We write $q_1 = p$. Thus q = (p, 1).

Lemma 1. Let $\Delta$, $\eta$, and $\epsilon$ be positive and Q be any number. There exists a positive
integer $N(\epsilon, \eta, Q)$ (which is a function only of the variables exhibited) such that

(5) $$P\{\delta(F(x \mid p), F_n^*(x \mid p)) < \eta + 2M(\Delta H),\ n = N, N + 1, \cdots, \text{ad inf.},\ |p - Q| \le \Delta\} > 1 - \epsilon$$

where H is any positive number such that H and -H are both points of continuity
of $F_1(x)$,

$$F_1(H) - F_1(-H) > 1 - \frac{\eta}{3},$$

and

$$M(v) = \sup_x |F_2(x) - F_2(x - v)|.$$


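PROOF. From the theorem of Glivenko-Cantelli we obtain that, for some
$N_0(\epsilon, \eta, Q)$,

(6) $$P\Bigl\{\delta(F(x \mid Q), F_n^*(x \mid Q)) < \frac{\eta}{3},\ n = N_0, N_0 + 1, \cdots, \text{ad inf.}\Bigr\} > 1 - \frac{\epsilon}{2}.$$

From the strong law of large numbers we have that, for some $N_1(\epsilon, \eta)$,

(7) $$P\Bigl\{\frac{1}{n} \sum_{i=1}^{n} \bigl[\psi_H(X_i^1) - \psi_{-H}(X_i^1)\bigr] > 1 - \frac{\eta}{3},\ n = N_1, N_1 + 1, \cdots, \text{ad inf.}\Bigr\} > 1 - \frac{\epsilon}{2}.$$

Thus the probability of the event

(8) $$\Bigl\{\delta(F(x \mid Q), F_n^*(x \mid Q)) < \frac{\eta}{3},\ \frac{1}{n} \sum_{i=1}^{n} \bigl(\psi_H(X_i^1) - \psi_{-H}(X_i^1)\bigr) > 1 - \frac{\eta}{3},\ n = N_2, N_2 + 1, \cdots, \text{ad inf.}\Bigr\}$$

exceeds $1 - \epsilon$, where $N_2 = \max(N_0, N_1)$. The event whose probability is
bounded in (7), in conjunction with $|p - Q| \le \Delta$, implies

(9) $$F_n^*(x - \Delta H \mid Q) - \frac{\eta}{3} \le F_n^*(x \mid p) \le F_n^*(x + \Delta H \mid Q) + \frac{\eta}{3}$$

for $n = N_1, N_1 + 1, \cdots$, ad inf. The event whose probability is bounded in
(6), together with (9), implies

(10) $$F(x - \Delta H \mid Q) - \frac{2\eta}{3} \le F_n^*(x \mid p) \le F(x + \Delta H \mid Q) + \frac{2\eta}{3}$$

for $n = N_2, N_2 + 1, \cdots$, ad inf.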







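From the formula for a convolution we obtain immediately that for any p
and any x

(11) $$|F(x - \Delta H \mid p) - F(x \mid p)| \le M(\Delta H).$$

Hence (10) implies

(12) $$F(x \mid Q) - \frac{2\eta}{3} - M(\Delta H) \le F_n^*(x \mid p) \le F(x \mid Q) + \frac{2\eta}{3} + M(\Delta H)$$

for $n = N_2, N_2 + 1, \cdots$, ad inf.

Finally we consider

(13) $$\delta(F(x \mid Q), F(x \mid p)) = \sup_x |P\{QX_1^1 + X_1^2 < x\} - P\{pX_1^1 + X_1^2 < x\}|.$$

We have

(14) $$\begin{aligned} P\{pX_1^1 + X_1^2 < x\} &= P\{QX_1^1 + X_1^2 + (p - Q)X_1^1 < x\} \\ &< P\{QX_1^1 + X_1^2 < x + \Delta H\} + \frac{\eta}{3} \le P\{QX_1^1 + X_1^2 < x\} + \frac{\eta}{3} + M(\Delta H). \end{aligned}$$

Similarly

(15) $$P\{pX_1^1 + X_1^2 < x\} \ge P\{QX_1^1 + X_1^2 < x\} - \frac{\eta}{3} - M(\Delta H).$$

(12), (14), and (15) imply

(16) $$\delta(F(x \mid p), F_n^*(x \mid p)) \le \eta + 2M(\Delta H).$$

This proves Lemma 1 with $N(\epsilon, \eta, Q) = N_2$.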
Lemma 2. Let $\epsilon$ and $\eta$ be arbitrary positive numbers. If $F_1(x)$ is continuous there
exist positive functions $K(\epsilon, \eta)$ and $N(\epsilon, \eta)$ such that

(17) $$P\{\delta(F(x \mid p), F_n^*(x \mid p)) < \eta,\ n = N, N + 1, \cdots, \text{ad inf.},\ |p| \ge K\} > 1 - \epsilon.$$

PROOF. Since $F_1(x)$ is continuous it is uniformly continuous. Let h be such
that $|x_1 - x_2| \le h$ implies $|F_1(x_1) - F_1(x_2)| < \eta/10$. Let $K_0 > 0$ be such that
$F_2(K_0) - F_2(-K_0) > 1 - \eta/10$, and $K_0$ and $-K_0$ are both points of continuity
of $F_2(x)$. Now, if $|p| > K_0/h$, then
| P{pXi + Xi < 2} — P{pXi < x} | 
(18) 


For N;, sufficiently large we have 


( 
(19) Pi8(F,(z), Ff,(2)) < 0°” = Ni,M+1,-"°, 
\ 







where $F_{1n}^*(x)$ is the empiric distribution function of $X_1^1, X_2^1, \dots, X_n^1$.
Obviously


(20) 8(F,(z), F%,(z)) = 6(F ),#t, (2)). 
Pp 


(18), (19), and (20) imply that 


— 7 
10 


f 3 ) ‘ 
(21) P43( Fx| p), Fh (=)) <n = Mi,Ni+1,---, ad inf.) >1—- = 
\ p ) 
From the strong law of large numbers it follows that, for N2 sufficiently large, 


Pin” > (Wx, (Xi) —_ vn, (XD) >1- = 


i=l 


(22) 
) 
n =: Nz, Nz + 1,-++, ad inf.) > 1 — 5 


j — 


Now p > Ko/h together with the event whose probability is bounded in (22) 
implies the event 


‘ Ft, (z= =) —~"< F%(xz|p) S Ft (G+) +2, 
(23) . . ; 


n = N2,N: + 1,-:-,ad inf.> 


4 


From (19), (20), (22), (23), and the definition of $h$ we obtain, with
$N_3 = \max(N_1, N_2)$, that


P< Fin () —=<S Fi(t|p) S Fin (2) +2, 
Pp Pp 2 


(24) 
i ; Ko\ 
n = N3,N3 + 1,---, ad inf., p > 7? >l-—e 

The same result obviously holds for $p < -K_0/h$. From (24) and (21) we obtain
the desired result with $K(\varepsilon, \eta) = K_0/h$, and $N(\varepsilon, \eta) = N_3$.

Lemma 3. Lemma 2 holds even when $F_1(x)$ is not continuous.

Proor. Let d;,d:,--- be the (necessarily denumerable) points of discon- 
tinuity of F;(x) and let ¢; be the saltus of F(x) at d;, i = 1, 2, --- , ad inf. Let 
r be such that 


(25) > ¢, em. 


r+1 10 


ys > Wy : a yf ° ° . 

Write Fi(x) = Fy(x) + F.(x), where F;(x) is continuous and nondecreasing, 
/ . . . . . . . 

F(x) is a nondecreasing step-function with saltuses of size ¢; at the points d,, 







-- , ad inf., and Fi(—«) = F3(—«) = 0. Define 


tp =l1- Doms ti 


Gf = t 
i= 1l— dint 


We shall assume that p > 0 and & > 0. The modifications needed in the proof 
below when p < 0 and/or } = 0 will be obvious. Let W;, ,j = 0,1, ---,r+ 1; 
n = 1,2, --- , ad inf., be chance variables distributed independently of each 
other and of the elements of Y, with distributions given by the following (for 
all n): 


1 , 
P{Wo. < 2} = it F(x) 
0 


P{W, = di} = 1 
P{W wasn = 0} = | 


Let Z,,n = 1, 2, --- , ad inf., be (independently distributed) chance variables 
defined for all n by the following: 


P{Z, = Win|Z1, °°: , Zn} =@ (¢ = 0,1, ---,r+ 1). 


For all positive p and all positive integral n we define chance variables Z,, by 
Zon = pZ, + X;,. Write V(zx | p) for the distribution function of Z,, . We have 
immediately from (25) that 


(26) 5(F(x | p), Vic|p)) < 7H" 


Let ny(n), (¢ = 0, 1, --- , 7 + 1) be the number of indices j for which Z; = 
Wi;,j = 1, +--+ ,n. From the strong law of large numbers it follows that, for 
any positive « and 7 and for some N’(e, n) large enough, 


f * 7 . 

4 i oi Og anf 1 2? = 0, pee ’ 
: P| vslm) “1<s5¢ pm"? 1 r+1 
27) 


n = N’,N’+1,---,ad inf.) >1— 5° 


Write F’? (x) for the convolution of (]~"F; (2) with F(x). Let H;(x | p, nyi(n)), 


;} = 0,1,---,r + 1, be the empiric distribution function of those Z,;, 7 = 
1, --- ,n, for which the corresponding Z; equals W;;, 7 = 1, --- ,n. (The 
saltuses of H; are integral multiples of (ny;(n))~*.) From (27), Lemma 2, and the 
theorem of Glivenko-Cantelli we conclude that there exist N”(e, 7) and K(e, 7) 
such that the probability exceeds 1 — ¢ that the following events will all occur 







for every n 2 N”(e, n) and every p > K(e, n): 


a= f iain Ll - iis 
(28) | yi(n) “1 <s0GFT)’ i = 0, 1, r+ 


' ( (n)). PU (x) ” 
(29) 6(Hol2 | p, nvo(n)), Fo(=)) < TEETH 


and 


a(H (x An) (x — pd: eS le pam Liceeie, 
(30) H (x | p, nyi(n)), F2(x — pdi)) <ioe +) i= 1 yr 
(We get (29) from Lemma 2, because F;(x) is continuous. We get (30) from the 
Glivenko-Cantelli theorem applied to F,(x), because the role of the W;, for 
i= 1, --- ,7,is to supply an additive constant which merely translates the dis- 
tribution functions.) When (28) holds we have from (25) 


| r 
(31) sup | F%(x| p) — d vi(n)H ila | p, nyi(n)) | “7 


i vo 


Also (29), (30), and (31) imply 


(32) sup | Fx(x\p) — a yi(n)F(x — pd) — yo(n) F(z) |< o. 


From (28) and (32) we obtain 


(33) sup | Ft(c|p) — X ttP(e — pd) — F%(2)| < zt. 
z tl 
From (33) we obtain 


(34) 5(F%(x | p), V(x | p)) < =. 


Finally (26) and (34) yield 
(35) 6\F(x| p), Faz |p) < o 


This completes the proof of Lemma 3. 

Proof of the Theorem. First suppose F;(x) is continuous. For given 7 and e let 
K(¢/2, ) and No(«/2, 7) be functions for which Lemmas 2 and 3 hold for 7 and 
«/2. Choose H as in Lemma 1, and choose A > 0 sufficiently small so that 
2M(AH) < n/2. (The latter can be done when F;(zx) is continuous.) Define 
[A~'K(¢/2, »)] as the smallest integer => A~'K(¢/2, n). Let Q, be defined by 


(36) Q,=K (,) — (2i — 1)A. 


As in Lemma 1 choose N;, i = 1, --- , [A ’K(¢/2, »)], so large that 
P{a(F(x |p), Fa(z|p)) <1,|p — Qi| <A,n = Ni, Ni + 1,---, ad inf} 


(37) e[ K(e/2, ») T° 
> 1 g[EemaT" 







(In the notation of Lemma 1, one can take N; = N(¢€/2[K(¢/2, n)/A]', »/2, Q,), 
since (n/2) + 2M(AH) < n). Let No = max {N,},i = 1, --- , [K(€/2, n)/A). 
Therefore, for 


(38) N* = max {N)(e/2, n), No} 
we have 
(39) P{s(F(x| p), Fi(a|p))<21,-2 < p< a, 
n = N*,N*+1,---,adinf.} > 1 —«. 


This proves the theorem when F;(x) is continuous. 

To prove the theorem for the case when F;(x) has discontinuities, proceed as 
in Lemma 3. Except for a probability sufficiently small so that it can be ignored, 
F.(x) consists of a continuous part and a step-function with a finite number of 
saltuses. We have already proved the theorem for the continuous portion. When 
the Xi, i = 1, --- , , assume one of the values at which a saltus occurs, the 
effect is simply to translate both the distribution function and the empiric dis- 
tribution function. In this case the Glivenko-Cantelli theorem already gives the 
desired result. Thus the theorem is proved when F,(x) is discontinuous and our 
proof is complete. 

The underlying ideas of the above proof are the following: 

A) When | p | is a large number and F;(z) is continuous the variables {Xj} play 
a small role in determining 5(F(x|q), F%(a|q)) (Lemma 2). This is made 
plausible by the following fact. Let J(x|q) be the distribution function of 
Xi +p ‘Xj, and J%(z | q) be the empiric distribution function of {X} + p™'X%}, 
j =1,---,n. Then 


6(F(a | q), Fa(x\q)) = 6(J(x|q), Ja(x] q)). 


B) The discontinuities in F;(x) act essentially to displace the distributions 
laterally and the distance is left invariant (Lemma 3, especially equation (30)). 
Hence, when | p | is large, say greater than a suitable number L*, the variables 
{X3} play a small role in determining 6(F(x|q), F%(x|q)), whether or not 
F(z) is continuous. 

C) The theorem is true when p varies in a small interval (Lemma 1), essen- 
tially because of the Glivenko-Cantelli theorem. 

D) The theorem is therefore true in general, because the interval $-L^* \le p \le L^*$
can be subdivided into a finite number of small intervals, for each of which
C) holds, and the case $|p| > L^*$ is taken care of by B).

These considerations show that our theorem holds with essentially the same 
proof under hypotheses much weaker than those we have stated. We shall con- 
tent ourselves with indicating just a few possible generalizations: 

a) The chance variables $\{X_j^i\}$ ($i$ fixed, $j = 1, 2, \dots$, ad inf.) need not be
independent of each other. If, for example, for each $i$, $\{X_j^i\}$ is a metrically
transitive stationary sequence of chance variables, the Glivenko-Cantelli
theorem will hold and so will our generalization of it. (As an example, see [4],¹
equation (6.3).)


b) Xi and Xj need not be independent, provided the dependence does not 
prevent B) and C) from holding. (As examples, see [2], Lemma 1, [4], equation 
(5.11), and [4], equation (6.10).) 

c) The chance variables may be vectors and need not be scalars. (As examples, 
see [4], equations (5.11) and (6.10).) 


REFERENCES 

[1] M. Fréchet, "Recherches théoriques modernes sur la théorie des probabilités," Gauthier-Villars, Paris, 1937.

[2] J. Wolfowitz, "Consistent estimators of the parameters of a linear structural relation," Skand. Aktuarietids., Vol. 35 (1952), pp. 132-151.

[3] J. Wolfowitz, "Estimation by the minimum distance method," Ann. Inst. Stat. Math., Tokyo, Vol. 5 (1953), pp. 9-23.

[4] J. Wolfowitz, "Estimation by the minimum distance method in nonparametric stochastic difference equations," to appear in Ann. Math. Stat., Vol. 25, No. 2 (1954).

¹ The reference here is to a paper which it had been hoped to publish in the present issue
of the Annals but which will appear in the next issue.





ON LEHMANN’S TWO-SAMPLE TEST 


By R. M. Sundrum


Division of Research Techniques, London School of Economics 


Summary. This paper considers some properties of a two-sample test, suggested
by Lehmann [2], against general alternatives. Alternative expressions are
given for the test statistic; a general formula for the variance is derived and
evaluated for the null case; the expectation is obtained in certain nonnull cases;
and the exact distributions in the null case are tabulated for some small samples.


1. Introduction. A statistic for testing the null hypothesis that two inde- 
pendent random samples come from the same population against general alter- 
natives (subject only to continuity of distribution functions) was proposed by 
Lehmann [2], based on the following lemma: 

LEMMA (4.1 of [2]). Let X, X'; Y, Y' be independently drawn from populations
with continuous cumulatives F, G respectively, and let us denote for any random
variables U, U'; V, V' the event max (U, U') < min (V, V') by U, U' < V, V'.
Then

$$p = P\{(X, X' < Y, Y') + (Y, Y' < X, X')\} = \frac{1}{3} + 2\int (F - G)^2 \, d\left(\frac{F + G}{2}\right),$$

and hence p attains its minimum value 1/3 if and only if F = G.

We can then base a test of the null hypothesis on a statistic which is a sample
estimate of this probability p and test in the usual manner whether this sample
estimate is significantly greater than 1/3. For example, given a sample of X's and
Y's, say of 2n observations each, we might choose n nonoverlapping quadruples
at random each containing 2 X's and 2 Y's, and consider as our statistic the
observed relative frequency of quadruples in which both X's are on the same
side of both Y's. This procedure however appears to be wasting information.
Lehmann has therefore suggested that it is more reasonable to consider the
relative frequency of such quadruples among all the $\binom{m}{2}\binom{n}{2}$ possible quadruples
that can be drawn from a sample of m X's and n Y's.


2. Alternative expressions for the test statistic. For practical purposes, Leh- 
mann has given the following expression for the test statistic, which we denote 
by L. 


Received 5/18/53. 





R. M. SUNDRUM 


(1) $$L = \frac{1}{2}\binom{m}{2}^{-1}\binom{n}{2}^{-1}\Big\{(m - 1)\sum_{i=1}^{m} R_i^2 - 2(m + n - 2)\sum_{i=1}^{m} iR_i - (m - 2n + 1)\sum_{i=1}^{m} R_i + \frac{m + 2n - 3}{6}\,m(m + 1)(2m + 1) + \frac{1}{2}m(m + 1)(m + n^2 - 3n + 1) - mn(n - 1)\Big\}$$ †

([2], p. 174) where $R_i$ is the rank of the ith ordered X-observation in the combined
sequence of the (m + n) members of the sample.

To see the structure of this statistic more clearly, write for the sample variance
of the ranks $R_i$

$$S_R^2 = \frac{1}{m}\sum_{i=1}^{m}(R_i - \bar R)^2 \quad\text{where}\quad \bar R = \frac{1}{m}\sum_{i=1}^{m} R_i,$$

and for the sample "covariance" of i and $R_i$

$$C = \frac{1}{m}\sum_{i=1}^{m}\left(i - \frac{m + 1}{2}\right)R_i.$$

Then, ignoring constant additive and multiplicative terms from (1), we have

(2) $$L' = m(m - 1)\left(\bar R - \frac{m + n + 1}{2}\right)^2 + m(m - 1)S_R^2 - 2m(m + n - 2)C.$$


The test statistic has thus three components; the first term depending on the 
average location of the X’s in the combined sequence, the second term depending 
on the dispersion of the R;’s and the last term depending on whether the X’s are 
evenly spaced out as they should tend to be under the null hypothesis. 

Alternatively, let (yxy) denote the event that when one X and two Y's are
drawn independently from the respective populations, the X-value lies between
the two Y-values; and let (xyx) denote the same event with X and Y interchanged.
Then it follows quite simply that

(3) $$p = 1 - P(yxy) - P(xyx).$$

Corresponding to the estimator L of p, we may consider as estimators of P(yxy)
and P(xyx) the relative frequencies $L_1$ and $L_2$ respectively of the specified events
among all possible triplets that can be drawn from the sample. In terms of ranks
we have

(4) $$L_1 = 2\sum_{i=1}^{m}(R_i - i)(n + i - R_i)/mn(n - 1)$$

(5) $$L_2 = 2\sum_{i=1}^{n}(S_i - i)(m + i - S_i)/mn(m - 1)$$

where $S_i$ is the rank of the ith ordered Y-observation in the combined sequence
of the (m + n) members of the sample. It can then be shown that

(6) $$L = 1 - L_1 - L_2$$


† The last term is omitted in Lehmann's formula.





TWO-SAMPLE TEST 141 


for any sample. This gives us an expression for the test statistic L which is 
symmetrical in X and Y, and somewhat more convenient for practical use. 
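
As a rough illustration of (4), (5) and (6), the following Python sketch computes L both directly from the definition (by counting quadruples) and from the ranks. The function names are ours, not the paper's, and the rank version assumes that there are no ties (continuous distributions) and that m and n are both at least 2.

```python
from itertools import combinations


def lehmann_L_bruteforce(xs, ys):
    """Relative frequency of quadruples (two X's, two Y's) in which
    both X's lie on the same side of both Y's."""
    hits = total = 0
    for x1, x2 in combinations(xs, 2):
        for y1, y2 in combinations(ys, 2):
            total += 1
            if max(x1, x2) < min(y1, y2) or max(y1, y2) < min(x1, x2):
                hits += 1
    return hits / total


def lehmann_L_ranks(xs, ys):
    """L = 1 - L1 - L2, with L1 and L2 computed from ranks as in (4) and (5)."""
    m, n = len(xs), len(ys)
    pooled = sorted(list(xs) + list(ys))
    rank = {v: k + 1 for k, v in enumerate(pooled)}  # assumes no ties
    R = sorted(rank[x] for x in xs)  # ranks of the ordered X-observations
    S = sorted(rank[y] for y in ys)  # ranks of the ordered Y-observations
    L1 = 2 * sum((R[i] - (i + 1)) * (n + (i + 1) - R[i]) for i in range(m)) / (m * n * (n - 1))
    L2 = 2 * sum((S[i] - (i + 1)) * (m + (i + 1) - S[i]) for i in range(n)) / (m * n * (m - 1))
    return 1 - L1 - L2
```

For the arrangement xxyy, for instance, `lehmann_L_bruteforce([1, 2], [3, 4])` gives 1.0, and the two functions should agree on any sample without ties.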

3. Expectation and variance of L. Let

$$D(i, j; k, l) = 1 \ \text{if}\ X_i, X_j < Y_k, Y_l \ \text{or}\ Y_k, Y_l < X_i, X_j; \qquad = 0 \ \text{otherwise.}$$

Then

(7) $$\binom{m}{2}\binom{n}{2} L = \sum\sum\sum\sum D(i, j; k, l) \qquad (i < j;\ k < l)$$

consisting of $\binom{m}{2}\binom{n}{2}$ terms. Therefore

(8) $$E(L) = E(D(i, j; k, l)) = p = P\{(X, X' < Y, Y') + (Y, Y' < X, X')\}.$$


In the null case, when F = G, we have p = 1/3 from the above lemma of Lehmann,
or from the consideration that of the six possible arrangements in order
of magnitude of the members of a single quadruple, all equally probable under
the null hypothesis

xxyy, xyxy, xyyx, yxxy, yxyx, yyxx,

in two arrangements only do both X's lie on the same side of both Y's.
Further, from (7)

(9) $$\left\{\binom{m}{2}\binom{n}{2} L\right\}^2 = \Big\{\sum\sum\sum\sum D(i, j; k, l)\Big\}^2 \qquad (i < j;\ k < l)$$

consisting of $\binom{m}{2}^2\binom{n}{2}^2$ terms which can be grouped in the following nine classes
of terms, involving the expectation terms shown against each class

Term                              Expectation    Number of terms: $\binom{m}{2}\binom{n}{2}$ times
D²(i, j; k, l)                    p              1
D(i, j; k, l) D(i, m; k, l)       r              2(m - 2)
D(i, j; k, l) D(i, j; k, f)       s              2(n - 2)
D(i, j; k, l) D(m, n; k, l)       t              ½(m - 2)(m - 3)
D(i, j; k, l) D(i, j; f, g)       u              ½(n - 2)(n - 3)
D(i, j; k, l) D(i, m; k, f)       v              4(m - 2)(n - 2)
D(i, j; k, l) D(m, n; k, f)       a              (m - 2)(m - 3)(n - 2)
D(i, j; k, l) D(i, m; f, g)       b              (m - 2)(n - 2)(n - 3)
D(i, j; k, l) D(m, n; f, g)       p²             ¼(m - 2)(m - 3)(n - 2)(n - 3)

(i, j, m, n all different; k, l, f, g all different.)






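Collecting terms together and simplifying, we get

(10) $$\begin{aligned}
\binom{m}{2}\binom{n}{2}\sigma^2(L) ={}& (a - p^2)m^2 n + (b - p^2)mn^2 \\
&+ (4v + 6p^2 - 5a - 5b)mn + (\tfrac{1}{2}t + \tfrac{3}{2}p^2 - 2a)m^2 \\
&+ (\tfrac{1}{2}u + \tfrac{3}{2}p^2 - 2b)n^2 + (2r - \tfrac{5}{2}t + 10a + 6b - 8v - \tfrac{15}{2}p^2)m \\
&+ (2s - \tfrac{5}{2}u + 6a + 10b - 8v - \tfrac{15}{2}p^2)n \\
&+ (p + 3t + 3u + 16v + 9p^2 - 4r - 4s - 12a - 12b).
\end{aligned}$$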


For evaluating the parameters occurring in the above expression, it is convenient
to express them in terms of the probabilities of certain ordered arrangements
of a given number of X's and Y's drawn at random from the respective
populations. In the following, we extend the notation of Section 2 and denote
by expressions like, for example, (xyxy) the event that when two X's and two
Y's are drawn at random and arranged in order of magnitude, they have the
indicated arrangement.

$$\begin{aligned}
p &= P\{(xxyy) + (yyxx)\} \\
r &= P\{(xxxyy) + (yyxxx)\} \\
t &= P\{(xxxxyy) + (yyxxxx)\} + \tfrac{1}{3}P(xxyyxx) \\
v &= P\{(xxxyyy) + (yyyxxx)\} + \tfrac{2}{9}P\{(xxyxyy) + (yyxyxx)\} \\
a &= P\{(xxxxyyy) + (yyyxxxx)\} \\
  &\quad + \tfrac{1}{3}P\{(xxxyxyy) + (yyxyxxx) + (xxyyyxx)\} \\
  &\quad + \tfrac{1}{9}P\{(xxyxxyy) + (yyxxyxx) + (xxyyxxy) + (yxxyyxx) + (xxyyxyx) + (xyxyyxx)\} \\
  &\quad + \tfrac{1}{18}P\{(xyxyxxy) + (yxxyxyx) + (yxxyxxy) + (xyxyxyx)\}.
\end{aligned}$$

Similar formulae for s, u and b can be derived from those for r, t and a by
interchanging x and y.

These probabilities can be evaluated very simply in the null case from the
property that all permutations of the ordered sequence of x's and y's are equally
probable. Then

$$p = \tfrac{1}{3}; \quad r = s = \tfrac{1}{5}; \quad t = u = \tfrac{7}{45}; \quad v = \tfrac{11}{90}; \quad a = b = \tfrac{1}{9}.$$

Substituting these values in (10), we find for the null case

(13) $$\sigma^2(L) = \frac{4\{(m + n)(m + n - 1) - 2\}}{45mn(m - 1)(n - 1)}$$

and when m = n,

(14) $$\sigma^2(L) = \frac{8(2n + 1)}{45n^2(n - 1)}.$$
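
Formula (13) can be checked against a direct enumeration for small samples. The sketch below is only an illustration in Python with exact rational arithmetic; the function names are ours, and it relies on the fact that under the null hypothesis every placement of the m X's among the m + n ranks is equally likely.

```python
from fractions import Fraction
from itertools import combinations


def null_moments(m, n):
    """Exact mean and variance of L under the null hypothesis, obtained by
    enumerating all placements of the m X's among the m + n ranks."""
    quadruples = (m * (m - 1) // 2) * (n * (n - 1) // 2)
    values = []
    for x_pos in combinations(range(m + n), m):
        x_set = set(x_pos)
        y_pos = [k for k in range(m + n) if k not in x_set]
        # a < b and c < d by construction, so "same side" means b < c or d < a
        hits = sum(
            1
            for a, b in combinations(x_pos, 2)
            for c, d in combinations(y_pos, 2)
            if b < c or d < a
        )
        values.append(Fraction(hits, quadruples))
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, variance


def null_variance_formula(m, n):
    """Right-hand side of (13)."""
    return Fraction(4 * ((m + n) * (m + n - 1) - 2), 45 * m * n * (m - 1) * (n - 1))
```

For m = n = 2 both routes give a variance of 2/9, and the mean is 1/3 in every case, in agreement with (8).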


The expectation of L can be obtained in certain nonnull cases by the use of (3).

(i) Rectangular distributions.

(a) Difference in location. Let X be uniformly distributed in the range 0 to 1,
and Y be uniformly distributed in the range $\Delta$ to $1 + \Delta$. Then it follows by
simple integration that

$$P(yxy) = P(xyx) = \tfrac{1}{3} - \Delta^2 + \tfrac{2}{3}\Delta^3 \qquad (0 \le \Delta \le 1)$$

so that

(15) $$p = \tfrac{1}{3} + 2\Delta^2 - \tfrac{4}{3}\Delta^3.$$

(b) Difference in scale. Let X be uniformly distributed in the range $-\tfrac{1}{2}$ to $+\tfrac{1}{2}$,
and Y be uniformly distributed in the range $-\Delta$ to $+\Delta$, where $\Delta > \tfrac{1}{2}$.
Then we have

$$P(yxy) = \tfrac{1}{2} - 1/24\Delta^2, \qquad P(xyx) = 1/6\Delta$$

so that

(16) $$p = (12\Delta^2 - 4\Delta + 1)/24\Delta^2.$$
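
The two rectangular cases can be checked numerically. The sketch below is not taken from the paper: it evaluates P(yxy) as the integral of 2G(1 - G) dF by a simple midpoint rule (our choice of method) and combines the two probabilities through (3). All function names are hypothetical.

```python
def uniform_cdf(x, lo, hi):
    """Distribution function of the uniform distribution on (lo, hi)."""
    return min(1.0, max(0.0, (x - lo) / (hi - lo)))


def prob_between(inner, outer, steps=20_000):
    """P(one observation from `inner` lies between two independent
    observations from `outer`), both uniform on the given (lo, hi) ranges,
    by the midpoint rule applied to the integral of 2 G (1 - G) dF."""
    a, b = inner
    h = (b - a) / steps
    total = 0.0
    for k in range(steps):
        g = uniform_cdf(a + (k + 0.5) * h, *outer)
        total += 2.0 * g * (1.0 - g)
    return total * h / (b - a)


def p_numeric(x_range, y_range):
    """Equation (3): p = 1 - P(yxy) - P(xyx)."""
    return 1.0 - prob_between(x_range, y_range) - prob_between(y_range, x_range)


def p_location(delta):
    """Equation (15), for 0 <= delta <= 1."""
    return 1 / 3 + 2 * delta**2 - 4 * delta**3 / 3


def p_scale(delta):
    """Equation (16), for delta > 1/2."""
    return (12 * delta**2 - 4 * delta + 1) / (24 * delta**2)
```

With a shift of 0.3, `p_numeric((0, 1), (0.3, 1.3))` should agree with `p_location(0.3)` to within the discretization error, and similarly `p_numeric((-0.5, 0.5), (-2, 2))` with `p_scale(2)`.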

(ii) Normal distributions.

(a) Difference in location. Let X and Y be normally distributed with the same
variance $\sigma^2$ and means $\mu_x$ and $\mu_y$ respectively, where $\mu_y - \mu_x = \delta\sigma$. If x is an
observation on X and $y_1$ and $y_2$ are two observations on Y, and if we define

$$u_1 = x - y_1, \qquad u_2 = x - y_2,$$

$u_1$ and $u_2$ are jointly distributed in the bivariate normal form with means $-\delta\sigma$,
variances $2\sigma^2$ and correlation coefficient $\tfrac{1}{2}$. Then

(17) $$P(yxy) = P(u_1 u_2 < 0) = 2\int_{\delta/\sqrt{2}}^{\infty}\int_{-\infty}^{\delta/\sqrt{2}} \frac{1}{2\pi\sqrt{1 - \rho^2}} \exp\left\{-\frac{t_1^2 - 2\rho t_1 t_2 + t_2^2}{2(1 - \rho^2)}\right\} dt_1\, dt_2 \quad \text{with } \rho = \tfrac{1}{2}.$$

We also find the same value for P(xyx). These values have been tabulated for
various values of $\delta$ in [3] and can be used to evaluate p.

(b) Difference in scale. Let X and Y be normally distributed with the same
mean, say 0, and variances $\sigma_x^2$ and $\sigma_y^2 \ne \sigma_x^2$. If $u_1$ and $u_2$ are defined as in the
previous case, they are now jointly distributed in the bivariate normal form with
means 0, variances $\sigma_x^2 + \sigma_y^2$ and correlation coefficient equal to $\sigma_x^2/(\sigma_x^2 + \sigma_y^2)$.
Therefore

$$P(yxy) = P(u_1 u_2 < 0) = \tfrac{1}{2} - \frac{1}{\pi}\sin^{-1} \sigma_x^2/(\sigma_x^2 + \sigma_y^2).$$

By a similar argument, we find

$$P(xyx) = \tfrac{1}{2} - \frac{1}{\pi}\sin^{-1} \sigma_y^2/(\sigma_x^2 + \sigma_y^2).$$

Hence, we have

(18) $$p = \frac{1}{\pi}\sin^{-1}\frac{\sigma_x^2}{\sigma_x^2 + \sigma_y^2} + \frac{1}{\pi}\sin^{-1}\frac{\sigma_y^2}{\sigma_x^2 + \sigma_y^2}.$$

These methods of evaluating p can then be extended to cases where both location
and scale are different in rectangular and normal populations.


4. The distribution of L. In the null case, the exact distribution of L may be
computed for small samples by enumerating the whole set of equiprobable
permutations. As for the limiting case, L is an extension of a U-statistic defined by
Hoeffding [1] and by Lehmann's Theorem 3.2 ([2], p. 167), $\sqrt{n}(L - E(L))$ has
a limiting normal distribution under the condition m/n = constant. However
in the null case, the variance of L is of order $n^{-2}$ and the limiting normal
distribution of $\sqrt{n}(L - E(L))$ is singular.

Some idea of the exact distribution in the null case may be obtained from the 
following tables for small samples, which were obtained by complete enumera- 
tion of the various possibilities. 
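
The complete enumeration described above can be sketched in Python as follows. This is an illustrative reimplementation, not the author's procedure; the names `null_distribution` and `upper_tail` are hypothetical. The statistic is kept in integer form, that is, as the number of quadruples in which both X's lie on the same side of both Y's.

```python
from collections import Counter
from itertools import combinations


def null_distribution(m, n):
    """Number of equally likely arrangements of m X's and n Y's giving each
    value of the integer statistic C(m,2) * C(n,2) * L."""
    counts = Counter()
    for x_pos in combinations(range(m + n), m):
        x_set = set(x_pos)
        y_pos = [k for k in range(m + n) if k not in x_set]
        hits = sum(
            1
            for a, b in combinations(x_pos, 2)
            for c, d in combinations(y_pos, 2)
            if b < c or d < a  # both X's below, or both X's above, both Y's
        )
        counts[hits] += 1
    return dict(sorted(counts.items()))


def upper_tail(counts):
    """P(statistic >= x) for each attainable value x, in ascending order of x."""
    total = sum(counts.values())
    remaining = total
    tail = {}
    for x, c in counts.items():
        tail[x] = remaining / total
        remaining -= c
    return tail
```

For m = n = 2, `null_distribution(2, 2)` returns `{0: 4, 1: 2}`, which corresponds to the tail probabilities 1.0000 and 0.3333.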


m = n = 2

 x    6P(L = x)    P(L ≥ x)
 0        4         1.0000
 1        2         0.3333

m = n = 3

 x    20P(9L = x)    P(9L ≥ x)
 1         8          1.0000
 3         8          0.6000
 5         2          0.2000
 9         2          0.1000

m = 2, n = 3

 x    10P(3L = x)    P(3L ≥ x)
 0         4          1.0000
 1         4          0.6000
 3         2          0.2000

m = 3, n = 4

 x    35P(18L = x)    P(18L ≥ x)
 2         4           1.0000
 3         8           0.8857
 4         4           0.6571
 5         4           0.5429
 6         5           0.4286
 8         2           0.2857
 9         4           0.2286
12         2           0.1143
18         2           0.0571

m = n = 4

 x    70P(36L = x)    P(36L ≥ x)
 6        16           1.0000
 9        24           0.7714
12        12           0.4286
15         2           0.2571
18         8           0.2286
21         4           0.1143
27         2           0.0571
36         2           0.0286







- ; ‘ m=n=5 


x 126P(60L = z) P(60L 2 z) x 252P(100L = xz) P(100L 2 a) 


10 .0000 20 32 1.0000 

1] . 9683 24 64 0.8730 

12 7 .9048 28 48 0.6190 
8095 32 16 .4286 

14 ‘ .7148 36 26 .3651 

15 ‘ .6825 40 24 .2619 
.5873 ) . 1667 

17 .5159 48 . 1429 

18 .4841 52 ) -1111 

3889 ) .0873 

.3651 .0476 

.3016 ; .0317 

. 2381 ’ .0159 

aaa ’ .0079 

. 2063 

.1905 

. 1587 

. 1270 

-1111 

.0952 

.0635 

.0476 

.0317 

.0159 


21 
22 
24 


25 


CO DM W bo 


bo bo bo 


10 
48 
60 


f 
4 
2 
2 
1 
2 
2 
2 


to 


I am much indebted to Mr. William Kruskal for many suggestions which 
have greatly improved the form of this paper. 


REFERENCES 
[1] W. Hoeffding, "A class of statistics with asymptotically normal distributions," Ann. Math. Stat., Vol. 19 (1948), pp. 293-325.
[2] E. L. Lehmann, "Consistency and unbiasedness of certain nonparametric tests," Ann. Math. Stat., Vol. 22 (1951), pp. 165-179.
[3] K. Pearson, Tables for Statisticians and Biometricians, Part II, 1st ed., Cambridge University Press, 1931.





TABLES FOR A NONPARAMETRIC TEST OF LOCATION 


By S. Rosenbaum


Directorate of Army Health, London 


1. Introduction. These tables are complementary to the set for a test of dispersion
which has already appeared [1], and are derived on the same lines.
Acknowledgements are again due to S. S. Wilks [2] in whose paper on tolerance
limits (1942) the basic formulas were first given.

To test whether two samples come from the same population, we merely 
count the number of points in one sample which lie outside an extreme value of 
the other sample. In the following account the end point is taken to be the 
greatest value, but the argument is identical if the smallest is chosen instead. 


2. Test that the samples come from the same population. If two independent
random samples of n points and m points are drawn from a population with a
continuous distribution function, the probability that s points of the sample
of m will lie to the right of the greatest value of the sample of n is [2]

(1) $$Q_s = n\binom{m}{s} B(n + m - s,\ s + 1)$$

where B is the complete Beta function.

For $s_0 < m$, $\sum_{s=0}^{s_0} Q_s$ is the probability that the value of s is not greater than
$s_0$, or equivalently, $1 - \sum_{s=0}^{s_0} Q_s$ is the probability that s is greater than or equal
to $s_0 + 1$. We can therefore fix a probability level $\varepsilon$ and determine $s_0$ such that

$$\sum_{s=0}^{s_0 - 1} Q_s \le \varepsilon < \sum_{s=0}^{s_0} Q_s.$$

Tables of $s = s_0 + 1$ are given for $\varepsilon = 0.95$ and $\varepsilon = 0.99$ over the range
n = 1, ..., 50, m = 1, ..., 50.

For sufficiently large and approximately equal values of m and n, $Q_s$ is approximately
equal to $2^{-(s+1)}$, and the critical values of $s_0 + 1$ are

$s_0 + 1 = 5$ for $\varepsilon = 0.95$,
$s_0 + 1 = 7$ for $\varepsilon = 0.99$.
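
The tabulated critical values can in principle be recomputed from (1). The following Python sketch is our own illustration, not the procedure used to prepare the tables: it evaluates $Q_s$ exactly with rational arithmetic, writing the Beta function through factorials (valid for integer arguments), and returns $s_0 + 1$. The function names are hypothetical, and the level should be passed as a string such as "0.95" so that it is converted exactly.

```python
from fractions import Fraction
from math import comb, factorial


def Q(s, n, m):
    """Equation (1): Q_s = n * C(m, s) * B(n + m - s, s + 1), using
    B(a, b) = (a - 1)! (b - 1)! / (a + b - 1)! for positive integers a, b."""
    beta = Fraction(factorial(n + m - s - 1) * factorial(s), factorial(n + m))
    return n * comb(m, s) * beta


def critical_s(n, m, eps):
    """Return s0 + 1, where s0 is the smallest integer whose cumulative
    probability Q_0 + ... + Q_s0 exceeds eps; None if s0 + 1 would exceed m."""
    eps = Fraction(eps)
    cumulative = Fraction(0)
    for s0 in range(m + 1):
        cumulative += Q(s0, n, m)
        if cumulative > eps:
            return s0 + 1 if s0 < m else None
    return None
```

For m = n = 50 this gives 5 at the 5% level and 7 at the 1% level, in line with the large-sample values quoted above.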
REFERENCES

[1] S. Rosenbaum, "Tables for a nonparametric test of dispersion," Ann. Math. Stat., Vol. 24 (1953), pp. 663-668.

[2] S. S. Wilks, "Statistical prediction with special reference to the problem of tolerance limits," Ann. Math. Stat., Vol. 13 (1942), pp. 400-409.

Received 6/1/53.





1% values of s

[Table entries not recoverable from the scan.]

The probability is less than 1% that s or more points of a sample of size m lie outside
an end point of a sample of size n if the samples are drawn randomly from the same
population, whatever its distribution.





values of s

[Table entries and percentage level not recoverable from the scan.]

The probability is less than ... that s or more points of a sample of size m lie outside
an end point of a sample of size n if the samples are drawn randomly from the same
population, whatever its distribution.





1% values of s

[Table entries not recoverable from the scan.]

The probability is less than 1% that s or more points of a sample of size m lie outside
an end point of a sample of size n if the samples are drawn randomly from the same
population, whatever its distribution.





5% values of s

[Table entries not recoverable from the scan.]

The probability is less than 5% that s or more points of a sample of size m lie outside
an end point of a sample of size n if the samples are drawn randomly from the same
population, whatever its distribution.





ON THE PROBLEM OF CONSTRUCTION OF ORTHOGONAL ARRAYS¹
By Esther Seiden


University of Chicago 


1. Summary. A method of constructing orthogonal arrays of an arbitrary
strength t is formulated. This method is a modification of the method based on
differences, formulated by R. C. Bose [1] for the purpose of constructing
orthogonal arrays of strength 2. It is shown further that each of the
multifactorial designs of R. L. Plackett and J. P. Burman [2], in which each factor
takes on two levels, provides a scheme for constructing orthogonal arrays of
strength 3, consisting of the maximum possible number of rows.

An orthogonal array (36, 13, 3, 2) is constructed. The method used for its
construction cannot lead to a number of constraints greater than 13. It is known
however [3] that 16 is an upper bound for the number of constraints in this
case; the problem as to whether this bound can actually be attained remains
unsolved.


2. Introduction. The theory of orthogonal arrays was developed by R. C.
Bose and K. A. Bush [3].² Following their definition a $k \times N$ matrix A with
entries from a set $\Sigma$ of $s \ge 2$ elements, is called an orthogonal array of size
N, k constraints, s levels, strength t, if each $t \times N$ submatrix of A contains all
possible $t \times 1$ column vectors with the same frequency $\lambda$. Such an array is
denoted by the symbol (N, k, s, t), and the number $\lambda$ is called the index of the
array. Clearly $N = \lambda s^t$.
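
The definition translates directly into a small check. The sketch below is our own illustration in Python: it takes an array as a list of k rows of length N with entries coded 0, ..., s - 1 (a coding we assume for convenience) and tests whether every t-rowed submatrix contains each possible column equally often. The function name and the small example array are hypothetical.

```python
from collections import Counter
from itertools import combinations, product


def is_orthogonal_array(rows, s, t):
    """True if every t-rowed submatrix of `rows` (a k x N array with entries
    0..s-1) contains each of the s**t possible columns with the same frequency."""
    n_cols = len(rows[0])
    if n_cols % s**t:
        return False
    lam = n_cols // s**t  # the index of the array
    for subset in combinations(rows, t):
        seen = Counter(zip(*subset))  # columns of the t-rowed submatrix
        if any(seen[col] != lam for col in product(range(s), repeat=t)):
            return False
    return True


# A (4, 3, 2, 2) array: 4 columns, 3 constraints, 2 levels, strength 2.
example = [
    [0, 0, 1, 1],
    [0, 1, 0, 1],
    [0, 1, 1, 0],
]
```

Here `is_orthogonal_array(example, 2, 2)` is true with index 1, while the same array cannot have strength 3 because 4 is not a multiple of 8.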

Hotelling [4] considered orthogonal arrays of strength two and two levels 
from the point of application of factorial designs to chemistry. His work was 
continued by Mood [5]. Plackett and Burman [2] studied orthogonal arrays, in 
their terminology multifactorial designs, from the point of view of an application
in physical and industrial research. Their work provided a complete solution
to the problem suggested by Hotelling. Some of the designs constructed by
Plackett and Burman were analysed by Kempthorne [6], and Brownlee and 
Loraine [7]. They pointed out that in the cases considered the main effects are 
confounded with the first order interactions; hence the designs are inadequate 
when the assumption that there is no interaction between the factors is un- 
realistic. These remarks can be extended to all the designs constructed by 


Received 3/9/53.

¹ Research carried out at the Statistical Research Center, University of Chicago, under
sponsorship of the Statistics Branch, Office of Naval Research.

² R. C. Bose asked me to mention that the theory of orthogonal arrays was started by
C. R. Rao. The references to his papers are: (1) C. R. Rao, "Hypercubes of strength 'd'
leading to confounded designs in factorial experiments," Bull. Calcutta Math. Soc., Vol.
38 (1946), pp. 67-78. (2) C. R. Rao, "Factorial experiments derivable from combinatorial
arrangements of arrays," J. Roy. Stat. Soc., Suppl., Vol. 9 (1947), pp. 67-78.





152 ESTHER SEIDEN 


Plackett and Burman, in the sense that in each of them the main effects are 
confounded with at least some of the degrees of freedom belonging to the first 


order interactions. However this deficiency can be removed by constructing 


certain designs of strength greater than or equal to 3. All the designs of Plackett 
and Burman in which s is equal to 2, yield such extensions. Moreover, one 
more factor can be accommodated in these extended designs. 


3. Construction of arrays of strength 3 from arrays of strength 2. 

THEOREM 1. Let S be an ordered set of s elements $e_0, e_1, \dots, e_{s-1}$. For any
integer t consider the $s^t$ different ordered t-tuples of the elements of S. They can be
divided into $s^{t-1}$ sets, each consisting of s t-tuples and closed under cyclic permutations
of the elements of S. Denote these sets by $S_i$, $i = 1, 2, \dots, s^{t-1}$. Suppose
that it is possible to find a scheme of r rows with elements belonging to S

$$\begin{matrix} a_{11} & a_{12} & \cdots & a_{1n} \\ \vdots & \vdots & & \vdots \\ a_{r1} & a_{r2} & \cdots & a_{rn} \end{matrix} \qquad (n = \lambda s^{t-1})$$

such that in every t-rowed submatrix the number of elements belonging to each $S_i$
is the same, say, equal to $\lambda$; then one can use this scheme in order to construct an
orthogonal array $(\lambda s^t, r, s, t)$. If in addition this scheme consists of an array of
strength t - 1, then one can construct an orthogonal array $(\lambda s^t, r + 1, s, t)$.

PROOF. The sets $S_i$ ($i = 1, \dots, s^{t-1}$) may be, for example, defined as follows.
Consider the $s^{t-1}$ distinct (t - 1)-tuples of the elements of S and let the first
t-tuple of each $S_i$ be the vector $(e, e_{i_1}, e_{i_2}, \dots, e_{i_{t-1}})$ where e is an arbitrarily
chosen element of S and the remaining elements of the vector form one of the
$s^{t-1}$ (t - 1)-tuples made to correspond to the set $S_i$. The additional s - 1
t-tuples of each of the sets $S_i$ are obtained from the first by cyclic permutation
of the elements of S.

An array $(\lambda s^t, r, s, t)$ can now be constructed. Let its first $\lambda s^{t-1}$ columns be
identical with the scheme satisfying the conditions of the theorem. Then the
array is completed by adjoining to these columns all the transformations of the
scheme consisting of cyclic permutations of the elements of S.

If the scheme consists of an array of strength t - 1, then an additional row
can be added of which, for example, the first $\lambda s^{t-1}$ elements are equal to the
first element of S, the next $\lambda s^{t-1}$ to the second element of S, and so on until
all the elements of S are exhausted.


THeoreM 2. If s is 2, then any orthogonal array of strength 2 forms a scheme, 
satisfying the conditions of Theorem 1, for the construction of an array of strength 3. 

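PROOF. Denote the elements of the array by 0 and 1, and the index of the
array of strength 2 by $\lambda$. Let $S_1$ consist of the column $(0, 0, 0)'$ together with
its complement $(1, 1, 1)'$, and let $S_2$, $S_3$, $S_4$ be the three remaining pairs, each
formed by a column of zeros and ones together with its complement.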





CONSTRUCTION OF ORTHOGONAL ARRAYS 53 


The theorem will be proved if we show that every three-rowed matrix of the scheme
contains $\lambda$ elements belonging to each $S_i$, i = 1, 2, 3, 4. Let $x_{ijk}$ (i, j, k = 0, 1)
denote the number of columns of any three-rowed matrix which contains i in
the first row, j in the second row and k in the third row. Then clearly

$$\sum_i x_{ijk} = \sum_j x_{ijk} = \sum_k x_{ijk} = \lambda,$$

from which it is easy to deduce the theorem. For example, $x_{000} + x_{100} = x_{100} + x_{110}$
or $x_{000} = x_{110}$. Also, $x_{110} + x_{111} = \lambda$ so that $x_{000} + x_{111} = \lambda$.

We will illustrate Theorems 1 and 2 by the following example. Consider the
orthogonal array (12, 11, 2, 2) constructed by Plackett and Burman [2].

0 1 1 0 1 1 1 0 0 0 1 0
0 1 0 1 1 1 0 0 0 1 0 1
0 0 1 1 1 0 0 0 1 0 1 1
0 1 1 1 0 0 0 1 0 1 1 0
0 1 1 0 0 0 1 0 1 1 0 1
0 1 0 0 0 1 0 1 1 0 1 1
0 0 0 0 1 0 1 1 0 1 1 1
0 0 0 1 0 1 1 0 1 1 1 0
0 0 1 0 1 1 0 1 1 1 0 0
0 1 0 1 1 0 1 1 1 0 0 0
0 0 1 1 0 1 1 1 0 0 0 1

It is seen that this orthogonal array of strength 2 satisfies the conditions of
Theorem 2. Now we construct the orthogonal array (24, 12, 2, 3) by adjoining
to this scheme its transformation obtained by interchanging zero and one. The
12th row will consist of 12 zeros and 12 ones.


OL11LO01L1LILOOOLO 
010111000101 
001110001011 
9VLILOGGlLoilis 
92 LT VG TSeTIV 7 
1 
I 
] 


LQOO0OLOOOLLIO!LI 
101000111010 
110001110100 
LO0O01LIOILOO!I 
L1001LLILOLOO1O 
VLUVUGPTIGTIEGCLILesi Teigeres 
0000101101 ] 
0001011011 ] 
0010110111001 
OL1D0LILOLILILIOOO 
SVLTIOGLLiaggeri 
0000000000001 


I 
1 1101001000 
LOLOO0O1LOOO!I 
O0OL001LOOOIL!I 
VIVeVi eget ti i 
10010001110 
Paw wa er es 8 8. 


I 
J 
| 


The described method of constructing orthogonal arrays of strength 3 yields,
in case $s$ equals 2, the maximum possible number of rows. This follows from
Theorem 2A proved by Bose and Bush [3] which reads: "For any orthogonal
array $(\lambda s^3, k, s, 3)$ of strength 3, the number of constraints $k$ satisfies the
inequality $k \le [(\lambda s^2 - 1)/(s - 1)] + 1$." For $s = 2$ the inequality reduces to
$k \le 4\lambda$.
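The construction of this section can be checked mechanically. The sketch below is not part of the paper: it assumes that each row of the Plackett and Burman scheme is a leading 0 followed by a cyclic shift of the cycle 1 1 0 1 1 1 0 0 0 1 0 (as read from the legible rows of the printed array), and the helper name `has_strength` is hypothetical.

```python
from collections import Counter
from itertools import combinations, product


def has_strength(rows, t, levels=(0, 1)):
    """Return True if every choice of t rows contains each t-tuple equally often."""
    n_cols = len(rows[0])
    n_tuples = len(levels) ** t
    if n_cols % n_tuples:
        return False
    expected = n_cols // n_tuples
    for chosen in combinations(rows, t):
        counts = Counter(zip(*chosen))  # one t-tuple per column
        if any(counts[tup] != expected for tup in product(levels, repeat=t)):
            return False
    return True


# Assumed generating cycle of the (12, 11, 2, 2) scheme.
CYCLE = [1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0]
scheme = [[0] + CYCLE[r:] + CYCLE[:r] for r in range(11)]

# Adjoin the transformation interchanging 0 and 1, then add the 12th row.
doubled = [row + [1 - x for x in row] for row in scheme]
doubled.append([0] * 12 + [1] * 12)

print(has_strength(scheme, 2), has_strength(doubled, 3))
```

The check enumerates all pairs or triples of rows, which is cheap at this size; it confirms strength only for this particular array and is no substitute for the proof of Theorem 2.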


4. An array of strength 2. An orthogonal array (36, 13, 3, 2) will be constructed.
The construction is based on the method of differences formulated by
Bose [1] and Bose and Bush [3]. It will be more convenient to use here the first
formulation. It reads as follows: Let $M$ be a module consisting of $s$ elements.
Suppose it is possible to find a scheme of $r + 1$ rows

$$\begin{matrix} a_{01}, & a_{02}, & \cdots, & a_{0n} \\ a_{11}, & a_{12}, & \cdots, & a_{1n} \\ \vdots & & & \vdots \\ a_{r1}, & a_{r2}, & \cdots, & a_{rn} \end{matrix}$$

such that (1) each row contains $\lambda s$ elements belonging to $M$; (2) among the
differences of corresponding elements of any two rows, each element of $M$
occurs exactly $\lambda$ times; then we can use the scheme to construct an orthogonal
array $(\lambda s^2, r + 2, s, 2)$. The following 12 × 12 scheme, satisfying the conditions
of the theorem, was found by trial and error:


[12 × 12 scheme with entries 0, 1, 2; the entries are not legible in this copy]

Twelve rows of the (36, 13, 3, 2) array are obtained by adjoining to this scheme 
its two transformations consisting of cyclic permutations of the elements zero, 
one and two. The 13th row will be added by putting, for example, four zeros, 
four ones, and four twos under each of the schemes in the same order. 
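The same steps can be tried on a much smaller scheme. The sketch below is an illustration only: the 2 × 3 scheme over the integers mod 3 is a hypothetical toy example, not the 12 × 12 scheme of the paper, and the function names are invented for this sketch. It adjoins the cyclic transformations of the scheme, adds the extra row that is constant under each copy, and checks strength 2.

```python
from collections import Counter
from itertools import combinations, product


def array_from_difference_scheme(scheme, s):
    """Adjoin the s cyclic translates of the scheme, then add one extra row."""
    n = len(scheme[0])
    # Block c of each row is the scheme row with c added to every entry (mod s).
    rows = [[(a + c) % s for c in range(s) for a in row] for row in scheme]
    # Extra row: the constant c under the c-th copy of the scheme.
    rows.append([c for c in range(s) for _ in range(n)])
    return rows


def is_strength_two(rows, s):
    """Every pair of rows must contain each ordered pair equally often."""
    expected = len(rows[0]) // (s * s)
    for r1, r2 in combinations(rows, 2):
        counts = Counter(zip(r1, r2))
        if any(counts[p] != expected for p in product(range(s), repeat=2)):
            return False
    return True


toy_scheme = [[0, 1, 2], [0, 2, 1]]  # hypothetical toy scheme, lambda = 1, s = 3
toy_array = array_from_difference_scheme(toy_scheme, 3)
print(is_strength_two(toy_array, 3))
```

Both toy rows contain every element of the module equally often, which is what lets the extra constant-per-copy row stay orthogonal to them.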

Two questions now naturally arise. The first is whether it is possible, using 
the same method of construction, to build a scheme consisting of a number of 
rows greater than 12. The second is whether it is possible to use a scheme of 
12 rows to construct an orthogonal array consisting of more than 13 rows. 
Both questions will be answered in the negative. 



The proof is based on an algebraic property of orthogonal arrays pointed
out by Bose and Bush [3].


Let $n_{ij}$ denote the number of columns that have $j$ coincidences, that is, $j$
elements equal, with the $i$th column. A necessary condition for an array to be
an orthogonal array $(N, k, s, t)$ is that whatever be the number $h$ such that
$0 \le h \le t$, the following equalities hold:

$$\sum_{j=0}^{k} n_{ij} \binom{j}{h} = \binom{k}{h} (\lambda s^{t-h} - 1) \quad \text{for } i = 1, 2, \cdots, N.$$

In the case considered $h$ takes on the values 0, 1, and 2 and the condition
reduces to

$$(*) \qquad \sum_{j=0}^{k} n_{ij} = \lambda s^2 - 1, \qquad \sum_{j=0}^{k} j\, n_{ij} = k(\lambda s - 1), \qquad \sum_{j=0}^{k} j(j - 1)\, n_{ij} = k(k - 1)(\lambda - 1).$$
Consider now the first $r + 1$ rows of an array constructed with the aid of the
theorem of Bose. It is easy to see that $n_{i0} \ge s - 1$ for all $i$. Clearly these
inequalities hold also for any subarray extracted out of the first $r + 1$ rows. This
means that in the case considered $n_{i0} \ge 2$ as long as we deal with the first 12
rows of the above constructed array. On the other hand for $k = 12$, $n_{i0} \le 2$
because otherwise the square of the deviation of the value $j = 0$ from the mean
of the $j$'s would exceed the total sum of squares of the deviations from the mean.
Hence for $k = 12$, $n_{i0} = 2$ for all $i$. Furthermore, $n_{i0} = 2$ implies $n_{i4} = 33$ and
$n_{ij} = 0$ for $j \ne 0, 4$ and all $i$. This can be shown by applying equalities (*) and
noticing that if $k = 12$ and $n_{i0} = 2$, then $Q = \sum_{j \ge 1} (j - 4)^2 n_{ij} = 0$.

Such an array could not include a scheme consisting of 13 rows, because the
solutions $n_{i0} = 2$, $n_{i4} = 22$, $n_{i5} = 11$ and $n_{ij} = 0$, $j \ne 0, 4, 5$ do not satisfy the
equations (*) for $s = 3$, $\lambda = 4$, $k = 13$. This answers the first question.

To answer the second question notice that if for $k = 12$, $\lambda = 4$, $s = 3$ the
solutions of the equations (*) are $n_{i0} = 2$, $n_{i4} = 33$, $n_{ij} = 0$ for $j \ne 0, 4$ and
all $i$, then for $k = 13$ the corresponding solutions are $n_{i1} = 2$, $n_{i4} = 24$, $n_{i5} = 9$
and $n_{ij} = 0$ for $j \ne 1, 4, 5$. This means every set of three columns belonging
to the subarray of 12 rows and closed under the cyclic transformations of the
elements of the array has the same element in the 13th row. It is seen that
this condition together with the condition that each element of the array has
to appear twelve times in each row will suffice in order to construct the 13th
row.


Let us see now whether one could add a 14th row to this array. It is enough
to consider the solutions of equations (*) for the unknown values of $n_{i4}$, $n_{i5}$,
$n_{i6}$ provided that $n_{i2} = 2$. This follows from the following reasoning. For 12
rows $n_{i0} = 2$, $n_{i4} = 33$ and $n_{ij} = 0$ for $j \ne 0, 4$. We may identify the two
columns which have no coincidences with the $i$th column as $l$ and $l'$. For 13 rows
$n_{i1} = 2$ and clearly these columns must be $l$ and $l'$. Hence by first considering
rows 1, 2, ..., 13 and then rows 1, 2, ..., 12 and 14 it follows that columns
$l$ and $l'$ will have the same 2-tuple in rows 13 and 14 that column $i$ has. Hence
$n_{i2} = 2$. For $k = 14$, $n_{i2} = 2$ implies $n_{i4} = 16$, $n_{i5} = 16$, $n_{i6} = 1$, $n_{ij} = 0$ for
$j \ne 2, 4, 5, 6$. Consider $i$ equal to 1. We may assume that the last two elements
of the first column are equal to zero. Then because $n_{12} = 2$, the last two
elements of the 13th and 25th columns will be equal to zero. Since $n_{16} = 1$ there
exists one more column different from the 1st, 13th and 25th which has the last
two elements equal to zero. Let us denote this column by $i'$. By our assumption,
$n_{i'2} = 2$; hence two more columns will have to have the last two elements equal
to zero. Consequently the assumptions that $\lambda = 4$ and that the array is of strength 2
would be incompatible for $k = 14$.
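The arithmetic behind the four sets of coincidence counts quoted above is easy to check against the equalities (*) with $s = 3$ and $\lambda = 4$. The following sketch is illustrative only; the function name `satisfies_star` is hypothetical.

```python
def satisfies_star(counts, k, s=3, lam=4):
    """counts maps j (number of coincidences) to n_ij; test the three equalities (*)."""
    total = sum(counts.values())
    first = sum(j * n for j, n in counts.items())
    second = sum(j * (j - 1) * n for j, n in counts.items())
    return (total == lam * s ** 2 - 1
            and first == k * (lam * s - 1)
            and second == k * (k - 1) * (lam - 1))


print(satisfies_star({0: 2, 4: 33}, k=12))               # True
print(satisfies_star({0: 2, 4: 22, 5: 11}, k=13))        # False: a 13-row scheme is impossible
print(satisfies_star({1: 2, 4: 24, 5: 9}, k=13))         # True
print(satisfies_star({2: 2, 4: 16, 5: 16, 6: 1}, k=14))  # True
```

The counts for $k = 14$ do satisfy (*); the impossibility of a 14th row comes from the column-counting argument in the text, not from (*) alone.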

I wish to express my thanks to R. C. Bose for suggesting this problem and 
to L. J. Savage and C. M. Stein for stimulating discussions during the prepara- 
tion of this paper. Thanks are also due to the referee whose comments helped 
to improve the formulation of the paper. 


REFERENCES 

[1] R. C. Bose, "Mathematics of factorial designs," Proceedings of the International
Congress of Mathematicians, 1952, pp. 543-547.

[2] R. L. Plackett and J. P. Burman, "The design of multifactorial experiments,"
Biometrika, Vol. 33 (1943-1946), pp. 305-325.

[3] R. C. Bose and K. A. Bush, "Orthogonal arrays," Ann. Math. Stat., Vol. 23 (1952),
pp. 508-524.

[4] H. Hotelling, "Some improvements in weighing and other experimental techniques,"
Ann. Math. Stat., Vol. 15 (1944), pp. 297-306.

[5] A. M. Mood, "On Hotelling's weighing problem," Ann. Math. Stat., Vol. 17 (1946),
pp. 432-446.

[6] O. Kempthorne, "A simple approach to confounding and fractional replication in
factorial experiments," Biometrika, Vol. 34 (1947), pp. 255-272.

[7] K. A. Brownlee and P. K. Loraine, "The relationship between finite groups and
completely orthogonal squares, cubes and hyper-cubes," Biometrika, Vol. 35
(1948), pp. 277-282.





ADMISSIBLE TESTS FOR THE MEAN OF A RECTANGULAR 
DISTRIBUTION¹


By ALLAN BIRNBAUM 
Columbia University 


1. Summary. Explicit characterizations are given of the minimal complete
class and a minimal essentially complete class of tests of a simple hypothesis
specifying the mean of a uniform distribution of known range. Examples are
given of tests which are optimal against various alternatives.


2. Introduction. Let $p_\theta(x)$ be the density function of a uniform distribution
with mean $\theta$ and known range $R$. Without loss of generality we may assume
$R = 1$, so that

$$p_\theta(x) = \begin{cases} 1 & \text{if } \theta - \tfrac12 \le x \le \theta + \tfrac12, \\ 0 & \text{otherwise,} \end{cases}$$

for any real value of $\theta$.

Consider the problem of testing a simple hypothesis specifying the value of
$\theta$ on the basis of a sample of $n$ $(n \ge 2)$ random independent observations
$x_1, x_2, \cdots, x_n$. Without loss of generality we may take the hypothesis to be
$H_0: \theta = 0$. We shall consider tests of $H_0$ against the general composite
alternative hypothesis $H_1: \theta \ne 0$.

It is known that a minimal sufficient statistic [1] for $p_\theta(x)$ is $(u, v)$, where
$u = \min_i x_i$, $v = \max_i x_i$. For all statistical purposes we may restrict our
attention to the range of $(u, v)$ as sample space rather than the range of
$(x_1, x_2, \cdots, x_n)$, as is shown for example in [2]. Thus we take as sample space
$T = \{(u, v) \mid u \le v \le u + 1\}$. Any test procedure may then be represented by
a decision function $\delta(u, v)$, where $\delta(u, v)$ is a real-valued Lebesgue-measurable
function defined on $T$, satisfying $0 \le \delta(u, v) \le 1$, and such that the test
procedure rejects $H_0$ with probability $\delta(u, v)$ when $(u, v)$ is observed. Hereafter by
"a test $\delta$" we shall mean a test represented by a Lebesgue-measurable function
$\delta = \delta(u, v)$ of the kind just described, and hereafter "measure" will refer to
Lebesgue measure. It is to be noted that in the following sections "the class of
all tests" will be understood to be $D = \{\delta(u, v)\}$, and not the class of all decision
functions $\delta(x_1, x_2, \cdots, x_n)$ defined on the original sample space.

The distribution of $(u, v)$ is given by the density function

$$p_\theta(u, v) = n(n - 1)\, k_\theta(u, v)\, (v - u)^{n-2},$$

where

$$k_\theta(u, v) = \begin{cases} 1 & \text{if } \theta - \tfrac12 \le u \le v \le \theta + \tfrac12, \\ 0 & \text{otherwise.} \end{cases}$$

Received 6/15/53.
¹ Work sponsored by the Office of Naval Research.


3. Characterization of the minimal complete classes. For any test $\delta$, let
$\beta_\delta(\theta)$ be the probability of accepting $H_0$ when $\theta$ is the true mean; that is,

$$\beta_\delta(\theta) = \int_T p_\theta(u, v)\,(1 - \delta(u, v))\, du\, dv.$$

Let $r_\delta(\theta)$, the risk function of the test $\delta$, be defined as the probability of an
incorrect decision; that is,

$$r_\delta(\theta) = \begin{cases} \beta_\delta(\theta), & \theta \ne 0, \\ 1 - \beta_\delta(0), & \theta = 0. \end{cases}$$

Let $\mathfrak{A}'$ be the class of nonrandomized decision functions each having as
acceptance region the interior of the subset of $T_0 = \{(u, v) \mid -\tfrac12 \le u \le v \le \tfrac12\}$
which lies above the graph of an arbitrary nondecreasing function $v(u)$.

The class $\mathfrak{A}$ of tests is defined as follows: $\delta \in \mathfrak{A}$ if and only if there exists a
$\delta' \in \mathfrak{A}'$ such that the set

$$\{(u, v) \mid \delta(u, v) \ne \delta'(u, v)\}$$

has measure 0.
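THEOREM 1. $\mathfrak{A}$ is the class of essentially unique Bayes solutions.

PROOF. For any test $\delta$ and any cumulative distribution function $\xi(\theta)$ we have

$$r_\delta(\xi) = \int_{-\infty}^{\infty} r_\delta(\theta)\, d\xi(\theta).$$

From the definition of $r_\delta(\theta)$ we have, letting $\gamma = \xi(0+) - \xi(0-)$,

$$r_\delta(\xi) = \int_{-\infty}^{\infty} \beta_\delta(\theta)\, d\xi(\theta) + \gamma\,(1 - 2\beta_\delta(0))$$

$$= \gamma + \int_{T_0} [\xi(u + \tfrac12) - \xi(v - \tfrac12) - 2\gamma]\, p_0(u, v)\,(1 - \delta(u, v))\, du\, dv
+ \int_{T - T_0} \left[\int_{-\infty}^{\infty} p_\theta(u, v)\, d\xi(\theta)\right] (1 - \delta(u, v))\, du\, dv.$$

To minimize $r_\delta(\xi)$ with respect to $\delta$ it clearly suffices to define

$$(1) \qquad \delta(u, v) = \begin{cases} 0 & \text{if } \xi(u + \tfrac12) - \xi(v - \tfrac12) - 2\gamma < 0, \\ 1 & \text{otherwise.} \end{cases}$$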

Now assume $\delta \in \mathfrak{A}$. Let $v(u)$ be that single-valued nondecreasing function
which characterizes $\delta$, in the manner described in the definition of $\mathfrak{A}$ above, to
within a set of measure 0. Let $u(v') = \inf \{u \mid v(u) \ge v'\}$, $-\tfrac12 \le v' \le \tfrac12$.




Let $\xi_1(\theta)$ be any cumulative distribution function which places zero mass in
the open interval $(-1, 1)$ and positive density at every other point. Let $\xi(\theta) =
\tfrac12(\xi_1(\theta) + \xi_2(\theta))$, where $\xi_2(\theta)$ is the cumulative distribution function defined by


‘0, és -1 


| 

414 + ul@+ 3), -1<0s50, 
j 

: 4 

9 = 

ie 

\1, 1< @. 


0<@s1, where y = £(0), 


To verify that $\xi_2(\theta)$ is nondecreasing, it suffices to note that $u(v')$ is nondecreasing
and never exceeds $\tfrac12$, and hence $\gamma \le \tfrac14$. We obtain as a Bayes solution
relative to $\xi(\theta)$, after simplification,


$$\delta_0(u, v) = \begin{cases} 0 & \text{if } (u, v) \in T_0 \text{ and } u < u(v), \\ 1 & \text{otherwise.} \end{cases}$$

It is clear $\delta_0$ is an essentially unique Bayes solution relative to $\xi(\theta)$, and hence
that $\delta$ is also.

Conversely let $\delta_0$ be an essentially unique Bayes solution relative to some $\xi(\theta)$.
Then one Bayes solution with respect to $\xi(\theta)$ is given by $\delta$ as defined in (1)
above. Since $\delta_0$ is an essentially unique Bayes solution relative to $\xi(\theta)$, $\delta_0 = \delta$
almost everywhere. Since $\xi(u + \tfrac12) - \xi(v - \tfrac12)$ is nondecreasing in $u$ and
nonincreasing in $v$, it follows that $\delta \in \mathfrak{A}$ and hence that $\delta_0 \in \mathfrak{A}$.

THEOREM 2. $\mathfrak{A}'$ is a minimal essentially complete class.

PROOF. Let $\Xi$ be the set of $\xi(\theta)$'s which are everywhere strictly increasing,
and let $C_\Xi$ be the class of Bayes solutions relative to members of $\Xi$. Then the
assumptions of Wald's Theorem 3.19 in [3] are satisfied; the conclusion of the
theorem asserts that the closure $\bar{C}_\Xi$ of $C_\Xi$ is essentially complete.

To show that $\mathfrak{A} \supset \bar{C}_\Xi$, let $\xi$ be any element of $\Xi$, and let $\delta$ be the corresponding
Bayes solution given by (1). The derivation of (1) shows that $\delta$ is essentially
unique if $\xi$ is everywhere strictly increasing. Since $\delta \in \mathfrak{A}$, $\mathfrak{A} \supset C_\Xi$. Since $\mathfrak{A}$ is
closed, $\mathfrak{A} \supset \bar{C}_\Xi$. Thus $\mathfrak{A}$ is essentially complete, and it follows that $\mathfrak{A}'$ is
essentially complete.

Let $\delta$ and $\delta'$ be any two different elements of $\mathfrak{A}'$. Since $\delta \ne \delta'$ on a set of
positive measure, and since each is an essentially unique Bayes solution, it follows
that for some $\theta'$, $r_\delta(\theta') > r_{\delta'}(\theta')$, and for some $\theta''$, $r_\delta(\theta'') < r_{\delta'}(\theta'')$. Thus no
element of $\mathfrak{A}'$ is uniformly as good as any different element of $\mathfrak{A}'$. Hence $\mathfrak{A}'$ is
minimal essentially complete.

COROLLARY. $\mathfrak{A}$ is the minimal complete class.

PROOF. It was shown above that $\mathfrak{A}$ is an essentially complete class consisting
of admissible tests. If $\delta$ is admissible but not in $\mathfrak{A}$, then $\mathfrak{A}$ contains a test $\delta'$ with
$r_{\delta'}(\theta) \equiv r_\delta(\theta)$. Since $\delta'$ is an essentially unique Bayes solution, $\delta = \delta'$ almost
everywhere, and $\delta \in \mathfrak{A}$, a contradiction. Hence there is no admissible test not in
$\mathfrak{A}$, and $\mathfrak{A}$ is complete.

It is interesting that the conclusions of Wald's Theorems 5.5 and 5.7, which
characterize complete classes of tests, become virtually empty in the case of the
present problem because these classes contain virtually all tests with acceptance
regions in $T_0$.


4. Examples. 

Example 1. One-sided alternative. If a test of size $\alpha$ having high power against
alternatives $\theta > 0$ is desired, the Neyman-Pearson lemma may be used to
construct the (essentially unique) best test of $H_0$ against the simple alternative
$H_1: \theta = 1 - \alpha^{1/n}$. This test is characterized by

$$v(u) = \begin{cases} u, & -\tfrac12 \le u < \tfrac12 - \alpha^{1/n}, \\ \tfrac12, & \tfrac12 - \alpha^{1/n} \le u \le \tfrac12. \end{cases}$$

By using the Neyman-Pearson lemma to construct best tests of $H_0$ of size $\alpha$
against any simple alternative $H_1: \theta = \theta_1$, $0 < \theta_1 < 1 - \alpha^{1/n}$, one can verify
that the above test has, at each $\theta$, $\theta > 0$, the maximum power attainable by
any test of size $\alpha$; hence the above test is uniformly most powerful against the
composite alternative $H_1: \theta > 0$.
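The size and power of the test in Example 1 can be written in closed form. The formula in the sketch below is derived here from the acceptance region (accept when $u < \tfrac12 - \alpha^{1/n}$ and $v \le \tfrac12$) and is not quoted from the paper; the function name `power_one_sided` is hypothetical, and only alternatives $\theta \ge 0$ are handled.

```python
def power_one_sided(theta, n, alpha):
    """Rejection probability of the Example 1 test when the true mean is theta >= 0.

    Observations are uniform on [theta - 1/2, theta + 1/2]; H0 is accepted
    when u < 1/2 - alpha**(1/n) and v <= 1/2.
    """
    a = alpha ** (1.0 / n)
    if theta >= 1.0 - a:
        # No point of the acceptance region lies in the support any longer.
        return 1.0
    # P(all x <= 1/2) minus P(all x in [1/2 - a, 1/2]).
    accept = (1.0 - theta) ** n - a ** n
    return 1.0 - accept


for theta in (0.0, 0.1, 0.3, 1.0 - 0.05 ** 0.2):
    print(theta, power_one_sided(theta, n=5, alpha=0.05))
```

At $\theta = 0$ the function returns the size $\alpha$, and at the alternative $\theta = 1 - \alpha^{1/n}$ singled out in the text it returns 1.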

Example 2. Two-sided alternative, locally best tests. If a test of size $\alpha$, having
the greatest possible power against alternatives $\theta$ close to zero, is desired,
we may take the test characterized by

$$v(u) = \begin{cases} u, & -\tfrac12 \le u < -\tfrac12 \alpha^{1/n}, \\ \tfrac12 \alpha^{1/n}, & -\tfrac12 \alpha^{1/n} \le u < \tfrac12 \alpha^{1/n}, \\ u, & \tfrac12 \alpha^{1/n} \le u \le \tfrac12. \end{cases}$$

As in Example 1 it can be verified that this test has at each $\theta$, $|\theta| \le \tfrac12(1 - \alpha^{1/n})$,
the maximum power attainable by any test of size $\alpha$; hence this test is uniformly
most powerful against the composite alternative $H_1: 0 < |\theta| \le \tfrac12(1 - \alpha^{1/n})$.
Again using the Neyman-Pearson lemma as above it can be shown that this
test is, among all admissible tests, uniformly least powerful against the composite
alternative $H_1: |\theta| \ge \tfrac12(1 + \alpha^{1/n})$. It is of interest to have such a simple example
of a locally best test which is the worst possible of all admissible tests against
certain "intermediate" alternatives (all tests with acceptance region contained in $T_0$
being good against "distant" alternatives).
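Example 3. Two-sided alternative with indifference zone. If a test of size
$\alpha$ is desired having the greatest possible power against all alternatives except
possibly values of $\theta$ close to 0, we may take the test characterized by:

Case A.

$$v(u) = \begin{cases} (\alpha/2)^{1/n} - \tfrac12, & -\tfrac12 \le u \le (\alpha/2)^{1/n} - \tfrac12, \\ u, & (\alpha/2)^{1/n} - \tfrac12 \le u < \tfrac12 - (\alpha/2)^{1/n}, \\ \tfrac12, & \tfrac12 - (\alpha/2)^{1/n} \le u \le \tfrac12. \end{cases}$$

Case B.

$$v(u) = \begin{cases} v_0, & -\tfrac12 \le u \le -v_0, \\ \tfrac12, & -v_0 < u \le \tfrac12, \end{cases}$$

where $v_0$ is determined by

$$1 - \alpha = \int_{-1/2}^{-v_0} \int_{v_0}^{1/2} p_0(u, v)\, dv\, du.$$

Comparison with the power function in the previous example shows that there
exists no uniformly most powerful unbiased test of $H_0$.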


5. Acknowledgement. The author is indebted to Dr. Milton Sobel, Professor 
Henry Scheffé, and the referee for their helpful suggestions. 


REFERENCES 
[1] E. L. Lehmann and H. Scheffé, "Completeness, similar regions and unbiased estimation,
Part I," Sankhyā, Vol. 10 (1950), pp. 305-340.
[2] P. R. Halmos and L. J. Savage, "Application of the Radon-Nikodym theorem to the
theory of sufficient statistics," Ann. Math. Stat., Vol. 20 (1949), pp. 225-241.
[3] A. Wald, Statistical Decision Functions, John Wiley and Sons, 1950.





NOTES 


THE MONOTONICITY OF THE RATIO OF TWO NONCENTRAL t
DENSITY FUNCTIONS¹


By WILLIAM KRUSKAL 
University of Chicago 


1. Summary. The ratio of two different noncentral $t$ density functions with
the same number of degrees of freedom is strictly monotone, with sense depending
on the relative values of the two noncentral constants.


2. Background. The ratio of two noncentral $t$ density functions has arisen in
several statistical connections. First, in the proof that the Student $t$-test is
uniformly most powerful invariant, the ratio of a noncentral $t$ density function
to a central $t$ density function arises. This is discussed by Lehmann ([4], chap. 4)
who gives a proof of monotonicity.

Second, the same ratio arises in the study of sequential $t$-tests; a discussion of
this is given by Arnold in [1].

Third, the case in which both numerator and denominator are noncentral $t$
density functions arises in connection with a sequential test for (one-sided)
fraction defective. A discussion of this is given by Rushton [5], and an earlier
reference to the same sequential test appears in Selected Techniques of Statistical
Analysis ([2], p. 83, footnote). In this case, as well as in that of the above
paragraph, monotonicity of the ratio is of interest because it implies that at any
stage of sampling the continue-sampling values of the natural test statistic,
Student's $t$, form an interval.

The purpose of this note is to give a very simple proof of the monotonicity
of such ratios. The method is similar to that used by Wald ([6], Section A.8.2).


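3. Statement. The noncentral $t$ density function with $\nu$ degrees of freedom and
noncentral parameter $\delta$ is

$$(3.1) \qquad \varphi(t; \nu, \delta) = \frac{\Gamma(\nu + 1)}{2^{(\nu - 1)/2}\, \Gamma(\nu/2)\, \sqrt{\pi \nu}} \left(\frac{\nu}{\nu + t^2}\right)^{(\nu + 1)/2} e^{-\frac12 \nu \delta^2/(\nu + t^2)}\; \mathrm{Hh}_\nu\!\left(\frac{-\delta t}{\sqrt{\nu + t^2}}\right),$$

where

$$(3.2) \qquad \mathrm{Hh}_\nu(x) = \frac{1}{\Gamma(\nu + 1)} \int_0^\infty z^\nu e^{-\frac12 (z + x)^2}\, dz.$$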
Received 5/11/53.
¹ Based on research supported by the Office of Naval Research at the Statistical
Research Center, University of Chicago.


This quantity is the density function for $(U + \delta)/\sqrt{W/\nu} = t$ where $U$ and $W$
are independent random variables having respectively unit-normal and $\chi^2(\nu)$
distributions. (The noncentral $t$ density function is readily derived from the
joint distribution of $U$ and $W$. It may be of interest to mention two minor
misprints in the statement of this function by Johnson and Welch [3]. In their (2)
the function should be divided by $\sqrt{\nu}$; and in their (3) a minus sign should
appear in the exponent.)

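If we consider two such density functions for the same $\nu$ but with $\delta = \delta_0$
and $\delta_1$ respectively, the natural logarithm of the ratio of the two is

$$(3.3) \qquad \ln \frac{\varphi(t; \nu, \delta_1)}{\varphi(t; \nu, \delta_0)} = -\frac12\, \frac{\nu}{\nu + t^2}\, (\delta_1^2 - \delta_0^2) + \ln \mathrm{Hh}_\nu\!\left(\frac{-\delta_1 t}{\sqrt{\nu + t^2}}\right) - \ln \mathrm{Hh}_\nu\!\left(\frac{-\delta_0 t}{\sqrt{\nu + t^2}}\right).$$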

I shall prove the following 


THEOREM. If $\delta_0 \ne \delta_1$, (3.3) is strictly monotone, increasing if $\delta_0 < \delta_1$ and
decreasing if $\delta_0 > \delta_1$.


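4. Proof. Replace the independent variable $t$ by the following strictly increasing
function of it:

$$(4.1) \qquad u = \frac{t}{\sqrt{\nu + t^2}}, \qquad t = \frac{u \sqrt{\nu}}{\sqrt{1 - u^2}},$$

so that $\nu + t^2 = \nu/(1 - u^2)$ and we may write (3.3) in the form

$$(4.2) \qquad -\tfrac12 (1 - u^2)(\delta_1^2 - \delta_0^2) + \ln \frac{\int_0^\infty z^\nu e^{-\frac12 (z - \delta_1 u)^2}\, dz}{\int_0^\infty z^\nu e^{-\frac12 (z - \delta_0 u)^2}\, dz}
= -\tfrac12 (\delta_1^2 - \delta_0^2) + \ln \frac{\int_0^\infty z^\nu e^{-\frac12 z^2 + u \delta_1 z}\, dz}{\int_0^\infty z^\nu e^{-\frac12 z^2 + u \delta_0 z}\, dz}.$$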


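Differentiate (4.2) with respect to $u$ and observe that the stated theorem is
equivalent to the statement that the sign of

$$(4.3) \qquad \int_0^\infty z^\nu e^{-\frac12 z^2 + u \delta_0 z}\, dz \int_0^\infty \delta_1 z^{\nu + 1} e^{-\frac12 z^2 + u \delta_1 z}\, dz
- \int_0^\infty z^\nu e^{-\frac12 z^2 + u \delta_1 z}\, dz \int_0^\infty \delta_0 z^{\nu + 1} e^{-\frac12 z^2 + u \delta_0 z}\, dz$$

is the same as the sign of $\delta_1 - \delta_0$. By rewriting each of the above terms as a
double integral in $(z_0, z_1)$ and combining, we see that the desired result is further
equivalent to showing that the sign of

$$(4.4) \qquad \int_0^\infty \int_0^\infty (z_0 z_1)^\nu e^{-\frac12 (z_0^2 + z_1^2) + u(\delta_0 z_0 + \delta_1 z_1)}\, (\delta_1 z_1 - \delta_0 z_0)\, dz_0\, dz_1$$

is the same as the sign of $\delta_1 - \delta_0$.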
From (4.4) the truth of the theorem is immediate if either $\delta_1$ or $\delta_0$ is zero, or if
$\delta_0$ and $\delta_1$ have opposite signs. Now suppose for simplicity that both $\delta_0$ and $\delta_1$ are
positive.


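Rewrite (4.4) in the following way:

$$(4.5) \qquad \iint_{\delta_1 z_1 > \delta_0 z_0 > 0} (z_0 z_1)^\nu e^{-\frac12 (z_0^2 + z_1^2) + u(\delta_0 z_0 + \delta_1 z_1)}\, (\delta_1 z_1 - \delta_0 z_0)\, dz_0\, dz_1
- \iint_{0 < \delta_1 z_1 < \delta_0 z_0} (z_0 z_1)^\nu e^{-\frac12 (z_0^2 + z_1^2) + u(\delta_0 z_0 + \delta_1 z_1)}\, (\delta_0 z_0 - \delta_1 z_1)\, dz_0\, dz_1,$$

and make the following changes of variable:

First Double Integral: $z_0 = s_0/\delta_0$, $z_1 = s_1/\delta_1$
Second Double Integral: $z_0 = s_1/\delta_0$, $z_1 = s_0/\delta_1$

to obtain

$$(4.6) \qquad (\delta_0 \delta_1)^{-(\nu + 1)} \iint_{s_1 > s_0 > 0} (s_0 s_1)^\nu e^{u(s_0 + s_1)}\, (s_1 - s_0)
\left[\exp\left(-\frac12 \left(\frac{s_0^2}{\delta_0^2} + \frac{s_1^2}{\delta_1^2}\right)\right) - \exp\left(-\frac12 \left(\frac{s_1^2}{\delta_0^2} + \frac{s_0^2}{\delta_1^2}\right)\right)\right] ds_0\, ds_1.$$

Hence the desired conclusion would be implied by the result that the sign of

$$(4.7) \qquad \frac{s_0^2}{\delta_0^2} + \frac{s_1^2}{\delta_1^2} - \frac{s_1^2}{\delta_0^2} - \frac{s_0^2}{\delta_1^2}$$

is opposite to that of $\delta_1 - \delta_0$ so long as $s_1 > s_0 > 0$. But (4.7) may be written as

$$\left(\frac{1}{\delta_1^2} - \frac{1}{\delta_0^2}\right)(s_1^2 - s_0^2),$$

whose sign is that of $\delta_0 - \delta_1$. This completes the proof for every case except that
of $\delta_0$, $\delta_1$ both negative. But this goes through with obvious minor modifications
in (4.5) and the subsequent manipulations.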
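The monotonicity can also be observed numerically from the second form of (4.2), in which the log ratio is $-\tfrac12(\delta_1^2 - \delta_0^2)$ plus the logarithm of a quotient of two integrals over $z > 0$. The sketch below is only an illustration under stated assumptions: the function name `log_ratio`, the truncation of the integrals at $z = 40$ and the use of Simpson's rule are choices made here, not part of the note.

```python
import math


def log_ratio(u, nu, delta0, delta1, upper=40.0, steps=4000):
    """Second form of (4.2); both integrals by Simpson's rule on [0, upper]."""
    def integral(delta):
        h = upper / steps  # steps must be even for Simpson's rule
        total = 0.0
        for i in range(steps + 1):
            z = i * h
            weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
            total += weight * z ** nu * math.exp(-0.5 * z * z + u * delta * z)
        return total * h / 3.0
    return (-0.5 * (delta1 ** 2 - delta0 ** 2)
            + math.log(integral(delta1) / integral(delta0)))


grid = [-0.9 + 0.1 * i for i in range(19)]  # u = t / sqrt(nu + t^2) lies in (-1, 1)
values = [log_ratio(u, nu=5, delta0=0.5, delta1=2.0) for u in grid]
print(all(x < y for x, y in zip(values, values[1:])))
```

Since $u$ is a strictly increasing function of $t$, monotonicity in $u$ on the grid corresponds to monotonicity in $t$; a finite grid can of course only illustrate the theorem, not establish it.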

I should like to thank Charles M. Stein for helpful comments made after 
reading a draft of this note. 


REFERENCES 


[1] Kenneth J. Arnold, Tables to Facilitate Sequential t-Tests, National Bureau of
Standards, Applied Mathematics Series 7, 1951, pp. v-xiii.

[2] W. Allen Wallis, "Use of variables in acceptance inspection for percent defective,"
Selected Techniques of Statistical Analysis, chap. 1, Statistical Research Group,
Columbia University, New York, McGraw-Hill Book Co., 1947.

[3] N. L. Johnson and B. L. Welch, "Applications of the noncentral t-distribution,"
Biometrika, Vol. 31 (1939-1940), pp. 362-389.

[4] Erich Lehmann, Theory of Testing Hypotheses, Mimeographed notes recorded by Colin
Blyth, ASUC Bookstore, University of California, Berkeley, 1948-1949.

[5] S. Rushton, "On a sequential t-test," Biometrika, Vol. 37 (1950), pp. 326-333.

[6] Abraham Wald, Sequential Analysis, John Wiley and Sons, Inc., New York, 1947.




AN EXTENSION OF THE BOREL-CANTELLI LEMMA 
By STANLEY W. NASH


University of British Columbia 


1. Introduction. Consider a probability space $(\Omega, \mathfrak{F}, P)$ and a sequence of
events $\{A_n\}$, $A_n \in \mathfrak{F}$, $n = 1, 2, \cdots$. The upper limiting set of the sequence is
defined to be

$$\limsup_{n \to \infty} A_n = \bigcap_{n=1}^{\infty} \bigcup_{k \ge n} A_k.$$

It is the event that infinitely many of the $A_n$ occur. The purpose of this paper is
to find necessary and sufficient conditions for $P(\limsup A_n) = 1$.

The general problem of finding the probability of an infinite number of a
sequence of events occurring was considered by Borel [1], [2] and Cantelli [3].
In what follows we shall use the following notations. Let $\alpha_n = I(A_n)$, the
indicator of the event $A_n$ (or characteristic function of the set $A_n$), that is

$$\alpha_n = \begin{cases} 1 & \text{when } A_n \text{ occurs} \\ 0 & \text{when } A_n \text{ fails to occur.} \end{cases}$$

Let $P(A_n \mid \alpha_1 \alpha_2 \cdots \alpha_{n-1})$ denote the conditional probability of the event $A_n$,
given the outcomes of the previous $n - 1$ trials. When $n = 1$, the expression is
taken to represent the unconditional probability $P(A_1)$. The 1912 Borel criterion
stated:


If $0 < p_n' \le P(A_n \mid \alpha_1 \alpha_2 \cdots \alpha_{n-1}) \le p_n'' < 1$ for every $n$, whatever be $\alpha_1$,
$\alpha_2, \cdots, \alpha_{n-1}$, then $\sum_{j=1}^{\infty} p_j'' < \infty$ implies that $P(\limsup A_n) = 0$, and
$\sum_{j=1}^{\infty} p_j' = \infty$ implies that $P(\limsup A_n) = 1$.

Cantelli proved that $\sum_{j=1}^{\infty} P(A_j) < \infty$ always implies that $P(\limsup A_n) = 0$.
Paul Lévy [4] clarified the general problem by proving the following theorem.
The subset $K$ (or $K'$) of the sample space $\Omega$ for which

$$\sum_{j=1}^{\infty} P(A_j \mid \alpha_1 \alpha_2 \cdots \alpha_{j-1}) < \infty \quad (\text{or} = \infty)$$

and the subset $H$ (or $H'$) of $\Omega$ for which $\limsup A_n$ fails to occur (or occurs)
differ at most by a set of probability 0. In other words $P(KH') = P(K'H) = 0$
and $P(KH) + P(K'H') = 1$. The hypothesis of the theorem proved in the next
section ensures that $KH$ is a null set, so that $P(K'H') = 1$. However, the proof
given is direct and independent of Lévy's result.

Loève [5] found necessary and sufficient conditions for $P(\limsup A_n) = 0$.
Let $p_{nk} = P(A_k \mid \alpha_n = \alpha_{n+1} = \cdots = \alpha_{k-1} = 0)$ for $k > n$, and let $p_{nn} = P(A_n)$.
The criterion states:

If $\lim_{n \to \infty} \sum_{k=n}^{\infty} p_{nk} = 0$, then, and only then, $P(\limsup A_n) = 0$.
Chung and Erdős [6] mentioned the sufficiency of the following criterion for
$P(\limsup A_n) = 1$.¹

If $\sum_{k=n}^{\infty} p_{nk} = \infty$ for every $n$, then, and only then, $P(\limsup A_n) = 1$.


2. A necessary and sufficient condition for $P(\limsup A_n) = 1$.
CRITERION. If $\sum_{j=1}^{\infty} P(A_j \mid \alpha_1 \alpha_2 \cdots \alpha_{j-1}) = \infty$ for every sequence
$\alpha_1 \alpha_2 \cdots \alpha_n \cdots$ of outcomes of trials for which only finitely many
$\alpha_n = I(A_n) = 1$ and no $P(\alpha_1 \alpha_2 \cdots \alpha_n) = 0$, then, and only then, $P(\limsup A_n) = 1$.

PROOF. The class $H$ of sequences $\lambda = \alpha_1 \alpha_2 \cdots \alpha_n \cdots$, for which only finitely
many $\alpha_n = I(A_n) = 1$, is denumerable, for its members can be put into one-to-one
correspondence with rational numbers between 0 and 1, say with those whose
binary expansions have the corresponding sequences of zeros and ones. It follows
that, if $P(\lambda) = 0$ for every $\lambda \in H$, then and only then, $\sum_{\lambda \in H} P(\lambda) = 0$,
that is $P(\text{Only finitely many } A_n) = 0$, and consequently $P(\text{Infinitely many } A_n) = 1$.

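Consider first those sequences $\lambda \in H$ for which $P(\alpha_1 \alpha_2 \cdots \alpha_n \text{ as in } \lambda) = 0$ for
some finite $n$. For such sequences $P(\lambda) \le P(\alpha_1 \alpha_2 \cdots \alpha_n \text{ as in } \lambda) = 0$, since the
event $\lambda$ is a subset of the event that the outcomes of the first $n$ trials are the
same as they are in $\lambda$, an infinite sequence of trials. Restrict further consideration
then to those sequences $\lambda \in H$ for which $P(\alpha_1 \alpha_2 \cdots \alpha_n) > 0$ for every $n$. For
such sequences all conditional probabilities $P(\alpha_n \text{ as in } \lambda \mid \alpha_1 \alpha_2 \cdots \alpha_{n-1} \text{ as in } \lambda)$
are defined and positive. Accordingly

$$P(\lambda) = \prod_{j=1}^{\infty} P(\alpha_j \text{ as in } \lambda \mid \alpha_1 \alpha_2 \cdots \alpha_{j-1} \text{ as in } \lambda)
= \prod_{j=1}^{\infty} \{1 - P(\alpha_j \text{ not as in } \lambda \mid \alpha_1 \alpha_2 \cdots \alpha_{j-1} \text{ as in } \lambda)\}.$$

The infinite product for $P(\lambda)$ is zero if, and only if,

$$\sum_{j=1}^{\infty} P(\alpha_j \text{ not as in } \lambda \mid \alpha_1 \alpha_2 \cdots \alpha_{j-1} \text{ as in } \lambda) = \infty.$$

For any $\lambda \in H$ all but finitely many $\alpha_n = 0$. Thus the series
$\sum_{j=1}^{\infty} P(\alpha_j \text{ not as in } \lambda \mid \alpha_1 \alpha_2 \cdots \alpha_{j-1} \text{ as in } \lambda)$
and $\sum_{j=1}^{\infty} P(\alpha_j = 1 \mid \alpha_1 \alpha_2 \cdots \alpha_{j-1} \text{ as in } \lambda)$ differ in only finitely many terms,
hence converge or diverge together. Therefore, $P(\lambda) = 0$ if, and only if,

$$\sum_{j=1}^{\infty} P(A_j \mid \alpha_1 \alpha_2 \cdots \alpha_{j-1} \text{ as in } \lambda) = \infty.$$

But $\sum_{\lambda \in H} P(\lambda) = P(\text{Only finitely many } A_n) = 0$ in this case, and so
$P(\limsup A_n) = P(\text{Infinitely many } A_n) = 1$. Q.E.D.

Received 5/19/53.
¹ In a communication to the referee of this paper, Chung and Erdős point out that the
statement given in their paper is wrong. There the condition "for every n" is omitted.
Also, the proof should not have been attributed to Borel.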


3. An application. Borel's criterion for $P(\limsup A_n) = 1$ is easily seen to be
a special case of the criterion of Section 2. To show that the generalization
achieved is not trivial consider the following example. Two urns each contain
a red and a black ball at the beginning of the experiment. A ball is drawn at
random from the first urn, its color noted, and the ball is returned to the urn.
This is repeated until a black ball is drawn. Each time a red ball is drawn from
the first urn, the number of balls in the second urn is doubled by putting in as
many red balls as there were balls of either color in the urn before. Once a black
ball has been drawn from the first urn, all further draws are at random from the
second urn with replacement after each draw. No further change is made in the
composition of the contents of the second urn. Let $A_n$ designate the drawing of
a black ball in the $n$th trial. Consider the sequences of trials for which the
$(k - 1)$st trial is the first time a black ball is drawn. Then $\alpha_{k-1} = 1$ but $\alpha_h = 0$
for $h < k - 1$. The second urn will contain $2^k$ balls at the $k$th trial and
thereafter, $2^k - 1$ red and 1 black. Thus

$$P(A_n \mid \alpha_1 \alpha_2 \cdots \alpha_{n-1}) = \begin{cases} \tfrac12 & \text{for } n < k \\ 2^{-k} & \text{for } n \ge k \end{cases} \qquad (k > 1).$$

Then $p_n' = \inf P(A_n \mid \alpha_1 \alpha_2 \cdots \alpha_{n-1}) = 2^{-n} > 0$. But $\sum_{j=1}^{\infty} p_j' = \sum_{j=1}^{\infty} 2^{-j} = 1$
converges, so the hypothesis of Borel's criterion does not hold and its conclusion
cannot be inferred. But $\sum_{j=1}^{\infty} P(A_j \mid \alpha_1 \alpha_2 \cdots \alpha_{j-1})$ diverges for every possible
sequence $\lambda$. In particular it diverges for every $\lambda \in H$, where only finitely many $\alpha_n$
are ones. Thus, by the criterion of Section 2, $P(\text{Infinitely many } A_n) = P(\text{Black
drawn infinitely often}) = 1$.
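The contrast between the two series in this example can be made concrete with partial sums. The sketch below takes the conditional probabilities as read from the displayed formula ($\tfrac12$ for $n < k$ and $2^{-k}$ for $n \ge k$, with lower bounds $2^{-n}$); these readings and the function names are assumptions of this sketch.

```python
def conditional_prob(n, k):
    """P(A_n | past) when the first black ball was drawn at trial k - 1 (k > 1)."""
    return 0.5 if n < k else 2.0 ** (-k)


def conditional_partial_sum(k, upto):
    """Partial sum of the conditional probabilities along one such sequence."""
    return sum(conditional_prob(n, k) for n in range(1, upto + 1))


def lower_bound_partial_sum(upto):
    """Partial sum of the lower bounds 2**(-n) entering Borel's criterion."""
    return sum(2.0 ** (-n) for n in range(1, upto + 1))


for upto in (10, 100, 1000, 10000):
    print(upto, lower_bound_partial_sum(upto), conditional_partial_sum(5, upto))
```

For fixed $k$ the conditional partial sums grow linearly, by $2^{-k}$ per trial after trial $k$, while the partial sums of the lower bounds never exceed 1.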


REFERENCES 

[1] Émile Borel, "Les probabilités dénombrables et leurs applications arithmétiques,"
Rend. Circ. Mat. Palermo, Vol. 27 (1909), pp. 247-271.

[2] Émile Borel, "Sur un problème de probabilités relatif aux fractions continues,"
Math. Ann., Vol. 72 (1912), pp. 578-587.

[3] Francesco Paolo Cantelli, "Sulla probabilità come limite della frequenza," Rend.
Accad. Lincei, Series 5, Vol. 24 (1917), Semester 1, pp. 39-45.

[4] Paul Lévy, Théorie de l'Addition des Variables Aléatoires, Gauthier-Villars, Paris,
1937, pp. 249-250.

[5] Michel Loève, "On almost sure convergence," Proceedings of the Second Berkeley
Symposium on Mathematical Statistics and Probability, University of California
Press, 1951, pp. 279-303, (see pp. 282-283).

[6] Kai-Lai Chung and Paul Erdős, "On the application of the Borel-Cantelli lemma,"
Trans. Amer. Math. Soc., Vol. 72 (1952), pp. 179-186.







THE DISTRIBUTION OF RADIAL ERROR 


HERSCHEL WEIL 


University of Michigan, Willow Run Research Center 


1. Summary. An expression (equation 2) suitable for computation is obtained
for the probability distribution of $\zeta = (\xi^2 + \eta^2)^{1/2}$ where $\xi$ and $\eta$ are independent
Gaussian variables with, in general, unequal means and unequal variances.


2. Introduction. The probability distribution for $\zeta$ has direct application in
considering the radial error in situations where a point must be located by two
coordinates as in navigation. It also has application in the study of turbulence
in fluids [1] and in other fields. In these applications $\xi$ and $\eta$ are not necessarily
uncorrelated. However, since a rotation of coordinates reduces the mathematical
problem in which the variables to be combined are correlated to one where they
are uncorrelated, $\xi$ and $\eta$ are considered to be uncorrelated in this note.

Before deriving equation (2) it is desirable to point out how the present result
fits in with related work in the literature. An expression for the characteristic
function for $\zeta$ is given by Patnaik [3] whose result in fact holds for the $n$
dimensional case, $n \ge 1$. If the variances of $\xi$ and $\eta$ are each $\sigma^2$, the variable $\zeta^2/\sigma^2$ is
governed by the noncentral $\chi^2$ probability distribution with two degrees of
freedom. The noncentral $\chi^2$ distribution in $n$ dimensions is given in terms of an
infinite series in [3] and in Tang [6]. The series essentially represents a Bessel
function $I_j(x)$ of the first kind, imaginary argument and order $j = \tfrac12 n - 1$ so
that one can write for the probability density of $\sigma^2 \chi'^2$

$$(1) \qquad g(x) = \frac{1}{2\sigma^2}\, e^{-(x + \lambda)/2\sigma^2} \left(\frac{x}{\lambda}\right)^{(n-2)/4} I_{(n-2)/2}\!\left(\frac{\sqrt{\lambda x}}{\sigma^2}\right),$$

where $\lambda$ is the sum of squares of the means of the $n$ variables. This result in the
two dimensional case is used to represent the probability density for the noise
plus signal following a quadratic detector in electrical circuit theory and is
derived and applied in [2].
The "method of mixtures" due to Robbins and Pitman is used by them in
[4] to express the noncentral $\chi^2$ distribution as an infinite sum of central $\chi^2$
distributions in the $n$ dimensional case and the method can be applied to the
present problem involving unequal variances. In addition a form of the
distribution for $\zeta$ in the two dimensional case where one mean is zero is given by Frenkiel
in [1]. Frenkiel’s result is an infinite series in which the nth term involves the 
(2n)th derivative of J,(x2). Both this result and the result by the method of 
mixtures appear to the writer to be less adaptable to computation than equa- 
tion (2). 
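The relation between the Bessel-function form of the noncentral $\chi^2$ density and its representation as a mixture of central $\chi^2$ densities can be checked numerically. The sketch below is illustrative only: it assumes that equation (1) has the standard form $g(x) = (2\sigma^2)^{-1} \exp[-(x+\lambda)/2\sigma^2] (x/\lambda)^{(n-2)/4} I_{(n-2)/2}(\sqrt{\lambda x}/\sigma^2)$ and that the mixture uses Poisson weights with mean $\lambda/2\sigma^2$. The function names are hypothetical and are not taken from [3] or [4].

```python
import math

def bessel_i(order, x, terms=60):
    """Modified Bessel function I_order(x), x > 0, real order >= 0, by its power series."""
    half = x / 2.0
    term = half ** order / math.gamma(order + 1.0)
    total = term
    for m in range(terms):
        # ratio of consecutive series terms
        term *= half * half / ((m + 1) * (m + order + 1))
        total += term
    return total

def noncentral_density(x, n, lam, sigma):
    """Assumed form of equation (1): density of sigma^2 * chi^2 (noncentral, n d.f.)."""
    s2 = sigma ** 2
    return (math.exp(-(x + lam) / (2 * s2)) / (2 * s2)
            * (x / lam) ** ((n - 2) / 4.0)
            * bessel_i(n / 2.0 - 1.0, math.sqrt(lam * x) / s2))

def mixture_density(x, n, lam, sigma, terms=120):
    """Same density as a Poisson-weighted sum of central chi-square densities."""
    s2 = sigma ** 2
    mu = lam / (2 * s2)          # Poisson mean (assumption stated above)
    y = x / s2                   # rescale to unit variance
    total = 0.0
    for k in range(terms):
        dof = n + 2 * k
        log_weight = -mu + k * math.log(mu) - math.lgamma(k + 1)
        log_central = ((dof / 2 - 1) * math.log(y) - y / 2
                       - (dof / 2) * math.log(2) - math.lgamma(dof / 2))
        total += math.exp(log_weight + log_central)
    return total / s2            # Jacobian of y = x / sigma^2

if __name__ == "__main__":
    print(noncentral_density(2.7, 3, 4.0, 1.5), mixture_density(2.7, 3, 4.0, 1.5))
```

The two functions should agree to rounding error for any $x > 0$, $\lambda > 0$, since the mixture is just the Bessel series written term by term.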


3. Result and derivation. The method used here to obtain p(r), the probability 
density or frequency function of $\zeta$, is simply to write the bivariate Gaussian 


Received 3/13/52, revised 7/13/53 







distribution in polar coordinates and integrate over the angular variable. The 
immediate result is a double summation of products of Bessel functions which 
is then reduced to a single summation by application of an addition theorem 
for Bessel functions. The end result is 


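$$p(r) = A r \exp\left[-\frac{(\sigma_1^2 + \sigma_2^2)\, r^2}{4\sigma_1^2\sigma_2^2}\right] \left\{ I_0(ar^2)\, I_0(dr) + 2\sum_{j=1}^{\infty} I_j(ar^2)\, I_{2j}(dr) \cos 2j\psi \right\}, \tag{2}$$

where

$$A = \frac{1}{\sigma_1\sigma_2} \exp\left[-\frac{m_x^2}{2\sigma_1^2} - \frac{m_y^2}{2\sigma_2^2}\right], \qquad a = \frac{\sigma_1^2 - \sigma_2^2}{4\sigma_1^2\sigma_2^2}, \qquad b = \frac{m_x}{\sigma_1^2}, \qquad c = \frac{m_y}{\sigma_2^2},$$

$$d = \sqrt{b^2 + c^2}, \qquad \tan\psi = c/b.$$
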
Here $\sigma_1^2$ and $\sigma_2^2$ are the variances of $\xi$ and $\eta$ and $m_x$ and $m_y$ their means. (In [8] 
the $I_j(x)$ are tabulated for j = 0(1)20, x = 0(.1)20 or 25. In [5] they are tabulated 
for j = 0(1)22 and x = 0(.01 or .02)5.) Note that equation (2) with $\sigma_1 = \sigma_2 = \sigma$ 
reduces, as it should, to $2r\,g(r^2)$ when g(x) is given by equation (1) evaluated 
for n = 2. 
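As a numerical illustration of equation (2), the series can be compared with a direct integration of the bivariate Gaussian density over the angle. The sketch below is not part of the original note. It assumes the constants obtained by expanding the Gaussian exponent in polar coordinates, namely $a = (\sigma_1^2 - \sigma_2^2)/4\sigma_1^2\sigma_2^2$, $b = m_x/\sigma_1^2$, $c = m_y/\sigma_2^2$, $d = \sqrt{b^2 + c^2}$, $\tan\psi = c/b$ and $A = (\sigma_1\sigma_2)^{-1}\exp[-m_x^2/2\sigma_1^2 - m_y^2/2\sigma_2^2]$. All function names are hypothetical.

```python
import math

def bessel_i(n, x, terms=60):
    """Modified Bessel function I_n(x) for integer n >= 0 (power series, any real x)."""
    half = x / 2.0
    term = half ** n / math.factorial(n)
    total = term
    for m in range(terms):
        term *= half * half / ((m + 1) * (m + n + 1))
        total += term
    return total

def radial_density_series(r, s1, s2, mx, my, terms=40):
    """Series form of p(r); constants a, b, c, d, psi, A are assumptions (see lead-in)."""
    a = (s1 ** 2 - s2 ** 2) / (4 * s1 ** 2 * s2 ** 2)
    b = mx / s1 ** 2
    c = my / s2 ** 2
    d = math.hypot(b, c)
    psi = math.atan2(c, b)
    big_a = math.exp(-mx ** 2 / (2 * s1 ** 2) - my ** 2 / (2 * s2 ** 2)) / (s1 * s2)
    series = bessel_i(0, a * r * r) * bessel_i(0, d * r)
    for j in range(1, terms):
        series += 2 * bessel_i(j, a * r * r) * bessel_i(2 * j, d * r) * math.cos(2 * j * psi)
    damping = math.exp(-(s1 ** 2 + s2 ** 2) * r * r / (4 * s1 ** 2 * s2 ** 2))
    return big_a * r * damping * series

def radial_density_direct(r, s1, s2, mx, my, steps=2000):
    """p(r) by integrating r * p(x, y) over the angle with the periodic trapezoid rule."""
    total = 0.0
    for k in range(steps):
        theta = 2 * math.pi * k / steps
        x, y = r * math.cos(theta), r * math.sin(theta)
        total += math.exp(-(x - mx) ** 2 / (2 * s1 ** 2) - (y - my) ** 2 / (2 * s2 ** 2))
    return r * total * (2 * math.pi / steps) / (2 * math.pi * s1 * s2)

if __name__ == "__main__":
    args = (1.7, 1.0, 2.0, 0.5, -1.0)   # r, sigma_1, sigma_2, m_x, m_y
    print(radial_density_series(*args), radial_density_direct(*args))
```

For moderate arguments only a few terms of the series contribute, which is consistent with the remark that equation (2) is convenient for computation.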

To derive the result (2) let p(x, y) be the bivariate Gaussian probability density 
which governs $\xi$ and $\eta$. Then in polar coordinates $x = r\cos\theta$ and $y = r\sin\theta$, 
the corresponding probability density is 


$$P(r, \theta) = \frac{Ar}{2\pi} \exp\left[-\frac{r^2(\sigma_1^2 + \sigma_2^2)}{4\sigma_1^2\sigma_2^2}\right] \exp\left[ar^2\cos 2\theta + br\cos\theta + cr\sin\theta\right]. \tag{3}$$


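The second exponential is to be integrated over $\theta$ from 0 to $2\pi$. Call the resulting 
integral L. By substituting into L the expansion for the usual generating function 
for the Bessel coefficients $J_n(z)$, 

$$\exp\left[\tfrac{1}{2}z\left(t - \frac{1}{t}\right)\right] = \sum_{n=-\infty}^{\infty} J_n(z)\, t^n, \tag{4}$$

expressed in the forms 

$$e^{iz\sin\theta} = \sum_{n=-\infty}^{\infty} J_n(z)\, e^{in\theta}, \qquad e^{z\cos\theta} = \sum_{n=-\infty}^{\infty} I_n(z)\, e^{in\theta}, \tag{5}$$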


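one obtains 

$$L = \int_0^{2\pi} \sum_{j=-\infty}^{\infty} \sum_{k=-\infty}^{\infty} \sum_{l=-\infty}^{\infty} I_j(ar^2)\, I_k(br)\, I_l(cr) \exp\left\{ i\left[(2j + k + l)\theta - \tfrac{1}{2}l\pi\right] \right\} d\theta. \tag{6}$$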







The order of integration and summation may be interchanged. From the 
definition of L it is clear that L is real so that only the real part of this expression 
need be integrated. The only nonvanishing contributions to the integral occur 
when 


$$2j + k + l = 0, \qquad l = 2n, \tag{7}$$


where n is an integer. The result is that 


$$L = 2\pi \sum_{j=-\infty}^{\infty} \sum_{n=-\infty}^{\infty} (-1)^n I_j(ar^2)\, I_{2n+2j}(br)\, I_{2n}(cr). \tag{8}$$


To reduce equation (8) to a single summation the following special case of 
an addition theorem given in Watson [7] is applied to the summation over n; 


$$I_n\left(\sqrt{Z^2 + z^2}\right) \cos n\varphi = \sum_{k=-\infty}^{\infty} (-1)^k I_{n+2k}(Z)\, I_{2k}(z), \qquad \tan\varphi = z/Z. \tag{9}$$

The result leads directly to equation (2). 


REFERENCES 
[1] F. N. Frenkiel, "Frequency distributions of velocities in turbulent flow," J. Meteorol., 
    Vol. 8 (1951), pp. 316-320. 
[2] J. L. Lawson and G. E. Uhlenbeck, Threshold Signals, McGraw-Hill Book Co., (1950), 
    p. 195. 
[3] P. B. Patnaik, "The non-central $\chi^2$ and F-distributions and their applications," 
    Biometrika, Vol. 36 (1949), pp. 202-232. 
[4] Herbert Robbins and E. J. G. Pitman, "Application of the method of mixtures to 
    quadratic forms in normal variates," Ann. Math. Stat., Vol. 20 (1949), pp. 552-560. 
[5] Wasao Sibagaki, "Tables of the Modified Bessel Functions," Kyushu University 
    Physics Department, 1946. 
[6] P. C. Tang, "The power function of the analysis of variance tests with tables and 
    illustrations of their use," Statistical Research Memoirs, Vol. II, University of London, 
    (1938), pp. 126-149. 
[7] G. N. Watson, A Treatise on the Theory of Bessel Functions, 2nd ed., Macmillan, 1948, 
    Equation 7, p. 361. 
[8] Tables of the Bessel Functions, Part II, Vol. 10, "Functions of positive integer order," 
    British Association for the Advancement of Science, 1952. 




CORRECTION TO "ON CERTAIN CLASSES OF STATISTICAL 
DECISION PROCEDURES"* 


By H. S. Konijn 


University of California, Berkeley 


I am indebted to Dr. L. Le Cam for pointing out an error in the above-named 
paper (Annals of Math. Stat., Vol. 24 (1953), pp. 440-448). Let 


Dp” = {6 ¢e:r(F, 5) is bounded by a function of F}. 


* Received 11/16/53 







In Theorem 2 change 2” to ®” (in the proof the subscript of M should be F). 
In the last Corollary add D”’ fa, y(a); —}, delete “D"’ is closed,” and change 
Dla, y(a); —| to D’'fa, y(a); —} with D’ any closed subset of D**” or D**”, 
b* denoting the set of all possible decision procedures (In the proof the first 
paragraph should be deleted, and D” changed to 9"”’.) 

In the penultimate paragraph of Section 2 change 0” to dD and D to 
©’, where 0’ is a closed convex subset of 2° or 9°“ satisfying (v). The 


enumeration of exceptions in the next paragraph should read 
“D,, » Di —; B, 2(8)}, and D’ and its subclasses for D’ C D“’.” 




ABSTRACTS OF PAPERS 


(Abstracts of papers presented at the Washington meeting of the 
Institute, December 27-30, 1953) 


1. Confidence Intervals of Fixed Length for the Poisson Mean and the Difference 
Between Two Poisson Means. Allan Birnbaum, Columbia University. 


1. To construct an estimate $\lambda'$ of the unknown parameter $\lambda$ of a Poisson process X(t) 
such that with probability at least $1 - \alpha$, $|\lambda' - \lambda| \le \epsilon$, where $\alpha$ and $\epsilon$ are given positive 
constants, let n be a positive integer. Observe $T_n$, the waiting time required for the 
occurrence of n events. Let $c = \alpha\epsilon^2/2n$. Perform additional observation of the process for 
$1/2cT_n$ units of time; let X be the number of events observed in this period. Set $\lambda' = 2cT_nX$. 
2. To construct an estimate $\Delta'$ of $\Delta = \lambda_1 - \lambda_2$, where $\lambda_1$, $\lambda_2$ are the unknown parameters 
of two Poisson processes, such that with probability at least $1 - \beta$, $|\Delta' - \Delta| \le \eta$, where 
$\beta$ and $\eta$ are given positive constants, we may set $\Delta' = \lambda_1' - \lambda_2'$, where $\lambda_1'$ and $\lambda_2'$ are obtained 
as above, taking $\epsilon = \eta/2$ and $(1 - \alpha)^2 = 1 - \beta$. In case the two processes can be observed 
simultaneously, a more efficient estimate can be given. At least for $\lambda$ exceeding some lower 
bound, c can be replaced by $c^* = \epsilon^2/k$, where $\Pr\{z \ge k\} = \alpha$ if z is the product of 
independent chi-square variates with 1 and 2n d.f. 
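The two-stage procedure of part 1 can be illustrated by simulation. The following sketch is an illustration only, not part of the abstract: it assumes the constant is $c = \alpha\epsilon^2/2n$, simulates the Poisson process through exponential inter-arrival times, and estimates how often $|\lambda' - \lambda| \le \epsilon$. The function names and the chosen parameter values are hypothetical.

```python
import random

def fixed_length_estimate(rate, n, alpha, eps, rng):
    """One run of the two-stage procedure. `rate` is the true lambda (used only to simulate)."""
    c = alpha * eps ** 2 / (2 * n)                       # assumed form of the constant c
    t_n = sum(rng.expovariate(rate) for _ in range(n))   # waiting time for n events
    duration = 1.0 / (2 * c * t_n)                       # length of the second observation period
    # Count events of the process falling in the second period.
    count = 0
    clock = rng.expovariate(rate)
    while clock <= duration:
        count += 1
        clock += rng.expovariate(rate)
    return 2 * c * t_n * count                           # the estimate lambda'

def coverage(rate, n, alpha, eps, runs=2000, seed=1):
    """Fraction of simulated runs with |lambda' - lambda| <= eps."""
    rng = random.Random(seed)
    hits = sum(abs(fixed_length_estimate(rate, n, alpha, eps, rng) - rate) <= eps
               for _ in range(runs))
    return hits / runs

if __name__ == "__main__":
    print(coverage(rate=2.0, n=10, alpha=0.1, eps=0.5))
```

Under these assumptions the estimate is unbiased given $T_n$ and its unconditional variance is $2nc = \alpha\epsilon^2$, so Chebyshev's inequality gives the stated probability bound; the simulated coverage is therefore expected to be well above $1 - \alpha$.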


2. Convexity Properties of the alpha-beta-set Under Composite Hypotheses. 
Herman Rubin and Oscar Wesler, Stanford University. 


Suppose one is presented with a statistical decision problem of the following kind. A 
random variable X is observed and it is desired to test whether the (not necessarily finite 
dimensional) parameter of the distribution of X is in $\Omega_0$ or in $\Omega_1$. Define, as usual, $\alpha(\varphi)$ 
to be the supremum of the probability of an error of the first kind and $\beta(\varphi)$ to be the 
supremum of the probability of an error of the second kind when the random decision 
procedure $\varphi$ is used. If $\Omega_0$ and $\Omega_1$ consist of one point each, it is known that the set S of all 
points $(\alpha(\varphi), \beta(\varphi))$ is convex and symmetric about $(\tfrac{1}{2}, \tfrac{1}{2})$. It is shown that the subset T of 
S lying on or below the line $\alpha + \beta = 1$ is convex, and that if the set of distributions under 
consideration is dominated by a $\sigma$-finite measure, the lower boundary of T belongs to T. 
It is also shown that the symmetric image of T, and possibly more, belongs to S. An 
example is given to show that this "more" can destroy convexity. 


3. Critical Regions in Terms of Lower Dimensional Critical Regions. L. M. 
Court, Diamond Ordnance Fuze Laboratory. 


Let $p_1(x \mid \theta) = p_1(x_1, \cdots, x_{n_1} \mid \theta_1, \cdots, \theta_{m_1})$ and $p_2(y \mid \phi) = p_2(y_1, \cdots, y_{n_2} \mid \phi_1, \cdots, \phi_{m_2})$ 
be two independent density distributions and $p(x, y \mid \psi) = p_1(x \mid \theta)\, p_2(y \mid \phi)$ the joint 
distribution formed by their multiplication. Let $R_1$, $R_2$, $R$ be critical regions of size $\alpha$, $\beta$, $\gamma$ 
for $p_1(x \mid \theta)$, $p_2(y \mid \phi)$, $p(x, y \mid \psi)$ respectively. Expressions are derived for $R$ in terms of 
$R_1$ and $R_2$ in the cases in which the critical regions are determined by 1) the method of 
maximum likelihood, 2) the likelihood ratio and 3) the Neyman-Pearson theory. 


4. A Stochastic Model of Traffic Congestion. C. B. Winsten, Cowles Com- 
mission, University of Chicago. 


A simplified model is presented to represent the behavior of traffic at an intersection 
controlled by a stop sign or repeated cycle traffic lights. An important property of such 
traffic is that it is spaced out, both on arriving at the intersection and on leaving it. To 
take account of this properly the model is set up with discrete time points at each of which 
at most one car can arrive or leave. For the stop sign case, cars in the minor road wait till 
there is at least a safe interval of w time units till the next car in the major road is due. 
The arrivals in the minor and major roads are taken as binomial, with probabilities $\alpha$, $\beta$, 
respectively. From a discussion of the queueing process in the minor road, the condition 
for the system to settle down to equilibrium is $\alpha < (1 - \beta)^w$, and in this case the mean 
waiting time per car is shown to be $[1 - (1 + w\beta)(1 - \beta)^w]/\beta[(1 - \beta)^w - \alpha]$. A method for 
calculating mean waiting time for the equal cycle traffic light problem is also given. 


5. On a Property of Certain Linear Functions of Order Statistics from some 
Normal Populations. K. C. Seal, University of North Carolina. 


Suppose there are n normal populations $N(\mu_i, \sigma^2)$, $i = 1, 2, \cdots, n$, and that one random 
observation from each of these n populations is given. It is not known which population 
any particular observation came from. Let $x_{(1)}, x_{(2)}, \cdots, x_{(n)}$ denote the n observations 
written in an increasing order of magnitude. It is shown that the expectation of any linear 
function $c_1x_{(1)} + \cdots + c_nx_{(n)}$ of the $x_{(i)}$ $(i = 1, \cdots, n)$ with nonnegative coefficients, at 
least one of which is positive, is a monotonically increasing function of each of the 
population means $\mu_i$ $(i = 1, \cdots, n)$. 


6. A Historical Note on the Relation Between Extreme Values and Tensile 
Strength. Julius Lieblein, National Bureau of Standards. 


It appears to be commonly believed that the statistical treatment of the "weakest 
link" hypothesis and the use of extreme-value methods in connection with strength of test 
specimens originated with F. T. Peirce in an article published in 1926. The discovery by 
the writer of a pair of long-forgotten articles in two engineering journals of the 1880's 
shows that we must push the date of first application of extreme values to breaking strength 
back nearly 50 years, at least. These articles show a statistical viewpoint ahead of their 
time, and might well be read by anyone interested in the development of statistical thought. 
The articles, both by W. S. Chaplin, are "The Relation Between the Tensile Strengths of 
Long and Short Bars," Van Nostrand's Engineering Magazine, December 1880, and "On 
the Relative Tensile Strengths of Long and Short Bars," Proceedings of the Engineers' 
Club, 1882. 


7. A Note on Statistics and sigma-subfields. R. R. Bahadur, Columbia University. 


Let there be given a Borel set X of m-dimensional Euclidean space ($1 \le m \le \infty$), and 
let S be the class of Borel measurable subsets of X. If f is a function on X into a set Y, 
let $S_f = \{f^{-1}(A) : A \subset Y,\ f^{-1}(A) \in S\}$. Then, for each f, $S_f$ is a $\sigma$-subfield of S. It is shown 
in this note that corresponding to any $\sigma$-subfield $S^*$ and any probability measure p on S 
there exists an f (depending in general on p) such that $S^* = S_f$, in the sense that 
corresponding to each set in one $\sigma$-subfield there exists a set in the other such that the symmetric 
difference of the two sets is of p-measure zero. This result is obtained by means of the 
theory of sufficiency, and has in turn certain applications in that theory; for example, if 
f is a necessary and sufficient statistic for a dominated set P of measures on S then $S_f$ is 
a necessary and sufficient $\sigma$-subfield for P. It is shown by an example that the converse of 
the last stated result is false. 


8. Completeness, Similar Regions, and Unbiased Estimation, Part II. (Preliminary 
Report.) E. L. Lehmann and Henry Scheffé, University of California, Berkeley. 


Continuation of Part I (Sankhyā, Vol. 10 (1950), pp. 305-340) to obtain theorems about 
the generation of complete families of measures from other complete families, application 
of these results (i) to the Pitman-Koopmans-Darmois family to prove certain tests con- 
cerning this family uniformly most powerful unbiased, and (ii) to some nonparametric 
problems. 


9. On Linear Regression Analysis when the Dependent Variable is Rectangular. 
E. G. Olds, Carnegie Institute of Technology. 


Assume chance variables $y_i$, rectangular on $\alpha + \beta(x_i - \bar{x}) \pm c$, with c known but with 
$\alpha$ and $\beta$ unknown. Values of $y_i$ are observed for a fixed set of x-values and an estimator 
$a + b(x_i - \bar{x})$ is to be determined. In general, the maximum likelihood estimator is not 
unique. Furthermore, the method of least squares sometimes yields an estimator which 
might be called inadmissible since, for one or more values of x, the value of the estimator 
differs from the observed value of y by more than c. The present paper gives a convenient 
method of obtaining a modified least squares estimator which is admissible in the sense 
implied above. The proposed estimator belongs to the convex set of maximum likelihood 
estimators and is unbiased. Also included in the paper is a method for finding the largest 
and smallest admissible estimates corresponding to any specified value of x. 


10. Some Further Results in Simultaneous Confidence Interval Estimation. 
S. N. Roy, University of North Carolina. 


In continuation of previous work in this line (S. N. Roy and R. C. Bose, "Simultaneous 
Confidence Interval Estimation," Ann. Math. Stat., Vol. 24 (1953), pp. 513-536) 
simultaneous confidence bounds have been obtained for each of the following sets: (1) (a) 
elementary symmetric functions of the characteristic roots of the covariance matrix $\Sigma$ for 
one multivariate normal population, (b) the same for the matrix $\Sigma_1\Sigma_2^{-1}$ for two 
multivariate normal populations and (2) certain simple functions of the canonical regressions 
between two subsets of a multivariate normal set. It has been possible to obtain the bounds 
for 1(a) in terms of similar functions of the characteristic roots of the sample matrix S, 
for 1(b) in terms of similar functions based on $S_1S_2^{-1}$, and for (2) in terms of sample 
canonical regressions. 


11. A Method for Generating Random Variates on Electronic Computers. 
D. Teichroew, National Bureau of Standards. 


Some of the methods by which values of random variates are obtained for use in high 
speed automatically-sequenced computers are: (i) using previously computed values, such 
as tables of normal deviates, (ii) determining the value for which the probability integral 
has a given (random) value, and (iii) accepting or rejecting random values on the basis of 
other random values, for example, a random value, z, may be accepted if another random 
value is less than the value of the density function at the point z. In many cases the most 
serious objection to these methods is that they take too much computing time. A method 
is proposed which appears useful for very fast computers with a relatively small amount 







of storage. The method consists of two computations. The machine first computes a random 
variate y and then transforms y into z, a variate with the desired distribution. Computa- 
tion time is minimized by selecting a y which can be computed quickly and which also 
permits the transformation to have a simple form. 
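Methods (ii) and (iii) of the abstract can be sketched briefly. The example below is an illustration under stated assumptions and is not the method proposed in the abstract: it inverts the probability integral of the unit exponential distribution, and it accepts or rejects uniform values against the density $6z(1-z)$ on (0, 1), whose maximum is 1.5. The target distributions and function names are chosen here purely for illustration.

```python
import math
import random

def exponential_by_inversion(rng):
    """Method (ii): solve F(z) = u for the unit exponential, F(z) = 1 - exp(-z)."""
    u = rng.random()              # u lies in [0, 1), so 1 - u lies in (0, 1]
    return -math.log(1.0 - u)

def beta22_by_rejection(rng):
    """Method (iii): accept z when a second random value falls below the density at z.

    Target density (an illustrative choice): f(z) = 6 z (1 - z) on (0, 1), maximum 1.5.
    """
    while True:
        z = rng.random()          # candidate value
        u = rng.random()          # second random value used for the accept/reject test
        if 1.5 * u < 6.0 * z * (1.0 - z):
            return z

if __name__ == "__main__":
    rng = random.Random(0)
    exp_sample = [exponential_by_inversion(rng) for _ in range(20000)]
    beta_sample = [beta22_by_rejection(rng) for _ in range(20000)]
    print(sum(exp_sample) / len(exp_sample), sum(beta_sample) / len(beta_sample))
```

The rejection step discards on average one candidate in three for this density, which illustrates the computing-time objection raised in the abstract.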


12. Theory of Successive Multiphase Sampling. (Preliminary Report.) B. D. 
Tikkiwal, Columbia University. 


Suppose there are k characters of an infinite population under consideration and the 
ith $(i = 2, \cdots, k)$ character is always studied on the part of the sample taken for the 
(i - 1)st character on each of the occasions observed up to a certain period. A best estimate 
and its variance for each of the k characters on each occasion were obtained under a certain 
pattern of correlation between the various characters occurring on the various occasions 
[paper read by the author before the annual meeting of the Indian Society of Agricultural 
Statistics, 1951]. Now it has been further noted that if the infinite population studied on 
each of the successive occasions is replaced by a finite population of size N, the best 
estimates for the various k characters remain the same. However, the variance of the best 
estimate is decreased in each case by a quantity $\sigma^2/N$, where $\sigma^2$ is the population variance 
on the hth occasion of the character under consideration. Thus in particular it can be 
noted that the best estimate given by Patterson [J. Roy. Stat. Soc. Suppl., Vol. 12 (1950), 
pp. 241-255] for the study of one character remains unaffected when the infinite population 
is replaced by a finite population of size N, but its variance is decreased by $\sigma^2/N$. 


13. Estimation of the Size of a Stratified Population. Douglas G. Chapman 
and C. O. Junge, Jr., University of Washington and Washington State 
Department of Fisheries. 


The estimation, by marking methods, of the size of a population which has a variable 
stratification, is studied. It is noted that no unbiased estimate of this parameter exists. 
Conditions are found under which the standard estimate (which is constructed without 
reference to the stratification) and an estimate given by Schaeffer are consistent. A new 
estimate is obtained, which is consistent not only under wider conditions, which are more 
likely to be fulfilled in actual marking experiments, but also under different sets of as- 
sumptions. The asymptotic variance of this estimate is derived. Tests are suggested for 
determining which of these estimates should be used. The results may also be adapted to 
evaluating the stratification changes that occur within a population. 


14. Sequential Life Tests in the Exponential Case. Benjamin Epstein and 
Milton Sobel, Wayne University and Cornell University. 

In this paper a sequential life test procedure is worked out in detail. As in previous work 
devoted to nonsequential methods, it is assumed that the underlying p.d.f. is exponential. 
An interesting feature of the test is that decisions are made continuously in time. Various 
useful formulae and tables are given. (Sponsored in part by the Office of Naval Research 
and the Office of Ordnance Research of the U. S. Army.) 


15. Distributions of Some Integrals of Certain Gaussian Stochastic Processes 
and the Limiting Distributions of Some "Goodness of Fit" Criteria. T. W. 
Anderson, Columbia University. 


To test the hypothesis that a sample of N observations has been drawn from a population 
with a specified continuous cumulative distribution function F(x) one can compare 
the empirical cumulative distribution $F_N(x)$ with F(x) by means of 

$$W_N = \int [F_N(x) - F(x)]^2\, \psi(F(x))\, dF(x)$$

and $V_N = \iint [F_N(x) - F(x)][F_N(y) - F(y)]\, l[F(x), F(y)]\, dx\, dy$. The limiting distribution 
of $V_N$ under the null hypothesis is the distribution of $V = \iint X(u)X(v)\, l(u, v)\, du\, dv$, where 
$X(u)$ $(0 \le u \le 1)$ is a certain Gaussian stochastic process with mean zero, and the 
characteristic function of V is shown to be $\prod_j (1 - 2it/\mu_j)^{-1/2}$, where $\mu_j$ are the eigenvalues of 
$\int k(u, w)\, l(w, v)\, dw$ and $EX(u)X(w) = k(u, w)$. The characteristic functions of 
$T = \int [X(u) + k(u)]^2 \psi(u)\, du$ and $S = \iint [X(u) + k(u)][X(v) + k(v)]\, l(u, v)\, du\, dv$ are shown to 
be the products of the characteristic functions of $W = \int X^2(u)\psi(u)\, du$ and V, respectively, 
and certain exponentials, the exponents being integrals of k(u). If the sample of N is drawn 
from $H_N(x)$ and if $H_N(x)$ approaches F(x) in a certain way, then the limiting distribution 
of $V_N$ is the distribution of T. 


16. Some Sampling Results on the Power of Nonparametric Tests Against 
Normal Alternatives. W. J. Dixon and D. Teichroew, National Bureau 
of Standards. 


This report contains some of the results of sampling investigations of the distribution 
of several nonparametric two-sample tests under the null hypothesis and under alternative 
normal hypotheses. The sampling was carried out at the University of Oregon in 1949-50 
and at the National Bureau of Standards, Los Angeles, in 1952. In both cases the number 
of samples was not large. The sampling performed, however, is sufficiently extensive to 
give a general indication of the relative power of the different tests and to indicate the 
range of alternatives to be sampled further to gain more precise determination of power. 
The tests, listed in order of their power to reject the alternative hypotheses, are: (i) rank 
sum test, (ii) maximum deviation test, (iii) median test, and (iv) run tests. The results 
for the rank sum test indicate that one does not lose much power if the rank sum test is 
used instead of the t test in cases where the distributions are actually normal. 


17. On the Large Sample Power of Rank Order Tests in the Two-Sample 
Problem. Meyer Dwass, Northwestern University. 


Let the random variables $X_1, \cdots, X_N$ be independent, and let $R_1, \cdots, R_N$ be their 
ranks. Let $S_N = \sum a_{Ni} f(R_i/N)$, where $a_{N1} = \cdots = a_{Nm}$, $a_{N,m+1} = \cdots = a_{NN}$, $\sum a_{Ni} = 0$, 
$\sum a_{Ni}^2 = 1$. Let $H_0$ be the hypothesis that the $X_i$ are identically distributed. Let $H_1(\theta)$ be 
the alternative that the $X_i$ are independent, but that the first m have one density function 
$g_1(x, \theta)$, and the remaining N - m have another density function $g_2(x, \theta)$, where $\theta$ is a 
one-dimensional parameter and where $g_1(x, 0) = g_2(x, 0)$. The following is shown subject 
to certain regularity conditions. 1) If f is a polynomial, $S_N$ is asymptotically normal when 
$H_0$ is true and when $H_1(\theta)$ is true. 2) Consider the test which rejects $H_0$ when $S_N$ is too 
large. For f a polynomial, we approximate the large sample power against $H_1(\theta)$ for such 
tests whose significance levels approach $\alpha$ as $N \to \infty$, in the following sense: c is determined 
such that the power differs from $1 - \Phi(\lambda - \theta\sqrt{N}c)$ by less than any preassigned $\epsilon > 0$ for 
N sufficiently large, where $\Phi$ is the normal (0, 1) c.d.f. and $1 - \Phi(\lambda) = \alpha$. 3) It is shown 
how to choose that polynomial of a given order which maximizes the large sample power. 
A polynomial f can be chosen of sufficiently high order so that the large sample power of 
the test based upon $S_N$ is arbitrarily close to the large sample power of the classical 
likelihood ratio test. 


18. Multiple Tests and Intersection Region Procedures. David L. Wallace, 
Massachusetts Institute of Technology. 


A statistical hypothesis is often expressed as the logical intersection of several compo- 
nent hypotheses. If tests of the component hypotheses are available, a natural test for the 
full hypothesis is defined to reject whenever one or more of the component hypotheses are 
rejected. If an indexed family of such hypotheses is tested, confidence regions for the index 
are obtained from each of the families of component tests. The region defined by the family 







of multiple tests is the intersection of the component regions. Properties of the multiple 


test and intersection region are discussed. Of special interest is the pth order general 
linear hypothesis with its p single linear component hypotheses. Intersection region 
procedures are useful in obtaining "usable" confidence regions for the location of the vertex 
in quadratic regression. 


19. Some Significance Test Procedures for Multiple Comparisons. H. O. 
Hartley, Iowa State College. 


In the comparison of k experimental means arising from an 'Analysis of Variance' of 
data one of the procedures for deciding on the significance of all the $\tfrac{1}{2}k(k - 1)$ differences 
is the so-called 'Newman-Keuls' procedure. Some properties concerning the error of the 
first kind and of the power involved in this procedure are proved. A similar sequential 
procedure for testing the significance of k mean squares is then suggested. This is based 
on the distribution of the largest of k F-ratios obtained from k treatment mean squares 
$s_i^2$, respectively based on $\nu_i$ degrees of freedom and all divided by the same error mean 
square based on $\nu$ degrees of freedom. This procedure is first developed for the case of 
equal $\nu_i$, then generalised to differing $\nu_i$ and shown to have properties similar to the 
Newman-Keuls procedure. 


20. A Sequential Test of Randomness Against Linear Trend. Gottfried E. 
Noether, Boston University. 


Given observations $X_1, X_2, \cdots$, let $Z_{gk} = 0$ or 1 depending on whether $X_{2gj+k}$ is smaller 
or greater than $X_{(2g+1)j+k}$, $j \ge 1$, $g = 0, 1, 2, \cdots$, $k = 1, 2, \cdots, j$. Under the alternative 
hypothesis of the linear trend $F(x_1, x_2, \cdots, x_m) = \prod_i F(x_i + i\theta)$, $P(Z_{gk} = 1) = 
P(X_1 > X_{j+1}) = p_j$, say, while under the hypothesis of randomness, $P(Z_{gk} = 1) = \tfrac{1}{2}$. The 
hypothesis of randomness can then be tested by means of the usual sequential procedure 
for testing the hypothesis that for a binomial distribution $p = \tfrac{1}{2}$ against the alternative 
$p = p_j$. There exists an optimum value of j in the sense that the expected number of 
observations required by the test corresponding to this j is not larger than that for any 
other j. This sequential test is compared with tests of randomness based on runs up and 
down and also with Mann's T-test. It turns out that on the average the sequential test 
requires fewer observations than these other tests, at least for sufficiently small values of 
$\theta$. (Research sponsored by Air Research and Development Command.) 


21. Further Results in the Theory of Quality Control Charts. Leo A. Aroian, 
Hughes Research and Development Laboratories. 


The type of decision for a quality control chart, the case of erratic production for two 
or more charts, and a special case of a control chart by attributes versus control charts by 
variables, are investigated. Examples illustrate the theory for the control of the mean and 
the standard deviation for one chart used alone or two charts used together. The theory 
extends results given in a previous paper, ‘“The Effectiveness of Quality Control Charts,”’ 
by L. A. Aroian and H. Levene, J. Amer. Stat. Assn., Vol. 45 (1950), pp. 520-529. 


22. Minimum Life in Fatigue. E. J. Gumbel and A. M. Freudenthal, Columbia 
University. 


The probability of survival at N cycles for constant stress levels S was analysed with 
the help of the asymptotic theory of smallest values of a nonnegative variate. In this case 
two parameters exist, the characteristic number of cycles to failure $V_S$ and a scale parameter 
$1/\alpha_S$. The available fatigue test results for copper, nickel and aluminum specimens at 
high stress levels show that this approach is justified in first approximation assuming that 
the "minimum life," that is the number of cycles $N_{0,S}$ below which no specimen breaks at 
the stress level S, is practically zero. However, test results on the same metals at low 
stress levels and for steel show a significant positive minimum life. Therefore the theory 
is generalized by the introduction of this value as a third parameter. The exponential 
function and four pseudo-symmetrical cases where two averages coincide or the skewness 
is zero, are special cases of this survivorship function. The three parameters are estimated 
by the method of moments which requires the calculation of the sample mean, the standard 
deviation and skewness. The estimate of the scale parameter depends only on the sample 
skewness which has a high degree of variation. The estimates of the characteristic number 
of cycles to failure and the minimum life depend upon all three statistics and are obtained 
without any successive approximations. The theory leads also to an upper bound, a large 
number of cycles at which the probability of survival becomes infinitesimally small. 
Observations on copper, nickel and aluminum at high stress levels and on steel traced on 
the logarithmic extremal probability paper lead to curved survivorship functions which 
are very well reproduced by this theory. (Work sponsored in part by the Office of Ordnance 
Research.) 


23. Bounds for the Distribution Function of a Sum of Independent, Identically 
Distributed Random Variables. Wassily Hoeffding and S. S. Shrikhande, 
University of North Carolina and University of Nagpur. 


The problem is considered of obtaining bounds for the cumulative distribution function 
of the sum of n independent, identically distributed random variables with k prescribed 
moments and given range. For n = 2 it is shown that the best bounds are attained or 
arbitrarily closely approached with discrete random variables which take on at most 
2k + 2 values. Explicit bounds are obtained for the case of nonnegative random variables 
with given mean when n = 2; for arbitrary values of n bounds are given which are 
asymptotically best in the "tail" of the distribution. Some of the results contribute to the 
more general problem of obtaining bounds for the expected value of a given function of 
independent, identically distributed random variables when the expected values of certain 
functions of the individual variables are given. 


24. Sequential Rank Sum Tests. (Preliminary Report.) Chia Kuei Tsao, 
Wayne University. 


Let f(x) be a continuous p.d.f. defined over a space S. To test a simple hypothesis $H_0$: 
$f(x) = f_0(x)$ against an alternative hypothesis $H_1$: $f(x) = f_1(x)$, we divide S into three 
mutually exclusive sets $S_0$, $S_1$ and $S_2$. For i = 0, 1, the set $S_i$ is subdivided into $k_i$ subsets 
$S_{i1}, S_{i2}, \cdots, S_{ik_i}$. Random observations are drawn successively. At each stage, count 
the number of observations falling in each of the $k_0 + k_1 + 1$ sets. For each m $(m = 1, 2, \cdots)$ 
denote by $m_{ij}$ the number of observations falling in the set $S_{ij}$, $j = 1, 2, \cdots, k_i$; i = 0, 1. 
Let $s_i = \sum_{j=1}^{k_i} j\, m_{ij}$ (i = 0, 1). Let $a_0$ and $a_1$ be two positive integers. Continue to draw 
observations as long as $s_i < a_i$ (i = 0, 1). The experiment is discontinued as soon as $s_0 \ge a_0$ 
or $s_1 \ge a_1$. The hypothesis $H_i$ is accepted if $s_i \ge a_i$ (i = 0 or 1). The $k_0 + k_1 + 1$ sets 
are determined so that the following four conditions are satisfied: (1) $S_{0j}$: $c_{0j} \le f_1(x)/f_0(x) \le 
c_{0,j-1}$, $j = 1, 2, \cdots, k_0$; (2) $S_{1j}$: $c_{1,j-1} \le f_1(x)/f_0(x) \le c_{1j}$, $j = 1, 2, \cdots, k_1$; (3) 
$\Pr(X \in S_{ij} \mid H_0) = \Pr(X \in S_i \mid H_0)/k_i$, $j = 1, 2, \cdots, k_i$, i = 0, 1; (4) the pair $(c_{00}, c_{10})$, 
where $c_{00} \le c_{10}$, is so determined that the test satisfies certain requirements. In this paper, 
the distribution and the m.g.f. of the sample size, the power function of the test and the 
ASN function are obtained. Other properties are also studied. Applications to the 
parametric and nonparametric problems are discussed. 


25. A Remark on the Geometrical Method of Construction of an Orthogonal 
Array. Esther Seiden, University of Chicago. 


R. C. Bose and K. A. Bush showed [Ann. Math. Stat., Vol. 23 (1952), pp. 508-524] how 
one can make use of the maximum number of points, no three collinear, in finite projective 







spaces in order to construct orthogonal arrays. In particular, this method enabled them to 
construct an orthogonal array (81, 10, 3, 3). They proved, on the other hand, that in the 
case considered the maximum number of constraints cannot exceed 12 (Theorem 2C). 
Hence they state: ‘‘We do not know whether we can get 11 or 12 constraints in any other 
way.’’ It is shown that no such way exists. 


26. Some Contributions to the Theory of Markov Chains. (Preliminary Re- 
port.) Cyrus Derman, Columbia University. 


Suppose that a collection of particles are moving about independently according to 
probabilities given by a Markov chain with transition matrix P = {p;;} i,j, = 0,1, °°: . 
Let A,(i) denote the number of particles in state i at time n (i, n = 0, 1, ---). Some suf- 
ficient conditions on P and on the distributions of the Ao(i)’s were found such that for 
any set i; , -*- , i, , the joint distribution of A,(i), --- , An(i-) tends as n — ~ to that 
of r independent Poisson distributions. Consider a recurrent Markov chain {X,}. Let 
N,(i) = {the number of r’s such that X, = i forl Sr S nj} i = 0,1, --- . The following 
theorem was proved. If H(n) is any nondecreasing unbounded function and if i and j are 
any two states belonging to the same class, then with probability one the inequality 
$|(N_n(j) - a_{ij}N_n(i))/(N_n(i) + N_n(j))H(N_n(i) + N_n(j))| > b_{ij}$, where $a_{ij}$ and $b_{ij}$ are
certain constants, will be satisfied for infinitely many or at most finitely many $n$ according
as $\sum H(n) \exp(-H^2(n)/2)/n$ diverges or converges. Sufficient conditions were given
such that for any $i_1, \ldots, i_r$ the distribution of $N_n(i_1), \ldots, N_n(i_r)$ properly normalized
approaches a multivariate normal distribution. The asymptotic covariance matrix was 
computed. 
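
The visit counts used in this abstract can be illustrated with a minimal sketch. It assumes a finite state space, whereas the abstract allows countably many states. The names `simulate_chain` and `visit_counts` are hypothetical and serve only to make the definition concrete: the count for state i is the number of times r between 1 and n at which the chain is in state i.

```python
import random
from collections import Counter


def simulate_chain(P, start, n, seed=0):
    """Return a path [X_1, ..., X_n] of a finite Markov chain.

    P is a row-stochastic matrix given as a list of lists (an assumption
    made for this sketch); `start` is the state X_0, which is not included
    in the returned path.
    """
    rng = random.Random(seed)
    states = range(len(P))
    state = start
    path = []
    for _ in range(n):
        # Draw the next state from row `state` of the transition matrix.
        state = rng.choices(states, weights=P[state])[0]
        path.append(state)
    return path


def visit_counts(path):
    """Count, for each state i, the times r with X_r = i along the path."""
    return Counter(path)
```

For a two-state chain, `visit_counts(simulate_chain([[0.5, 0.5], [0.2, 0.8]], start=0, n=1000))` returns the counts for states 0 and 1, which always sum to 1000.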


27. Minimax Invariant Procedures for Estimating Cumulative Distribution 
Functions. Om P. Aggarwal, University of Washington.


Let $x_1 < x_2 < \cdots < x_n$ be the ordered observations on a chance variable with cumulative
distribution function $F$. Let $\hat{F}$ denote an estimate of $F$ based only upon the sample. The
minimax invariant procedures of estimating $F$ are obtained for two classes of loss functions
$L(F, \hat{F})$. For $L(F, \hat{F}) = \int |F(x) - \hat{F}(x)|^r \, dx$, with integer $r \ge 1$, the minimax invariant
procedure is to estimate $F$ by a step function $\hat{F}(x) = c_j$ for $x_j \le x < x_{j+1}$; $j = 0, 1, \ldots, n$,
where $x_0$ and $x_{n+1}$ denote $-\infty$ and $+\infty$ and $c_j$ is obtained as the root of an equation of
degree $(n + r)$ when $r$ is odd and of degree $(r - 1)$ when $r$ is even. For the special case
$r = 1$ the value of $c_j$ is the median of a Beta distribution. For $r = 2$, one obtains
$c_j = (j + 1)/(n + 2)$. For the class of loss functions
$L(F, \hat{F}) = \int [F(x) - \hat{F}(x)]^{2k} / \{F(x)(1 - F(x))\} \, dx$,
one again obtains for minimax invariant procedure step functions with
$c_j$ determined as root of an equation of degree $(2k - 1)$. In particular for $k = 1$ this optimum
procedure turns out to be the usual sample cumulative function with $c_j = j/n$. (Work
supported by the Office of Naval Research.)
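
As a minimal illustrative sketch, not taken from the abstract, the two closed-form cases stated above can be written out in Python: step heights (j + 1)/(n + 2) for r = 2, and j/n for the usual sample cumulative function. The function name `step_estimate` and the `rule` labels are hypothetical, and the observations are assumed to be distinct. The general case, where each step height is the root of a polynomial equation, is not attempted here.

```python
from bisect import bisect_right


def step_estimate(sample, rule="squared_error"):
    """Return a step-function estimate of a cumulative distribution function.

    rule="squared_error": heights (j + 1)/(n + 2), the r = 2 case.
    rule="sample_cdf":    heights j/n, the usual sample cumulative function.
    """
    xs = sorted(sample)
    n = len(xs)
    if rule == "squared_error":
        heights = [(j + 1) / (n + 2) for j in range(n + 1)]
    elif rule == "sample_cdf":
        heights = [j / n for j in range(n + 1)]
    else:
        raise ValueError(f"unknown rule: {rule!r}")

    def estimate(x):
        # j = number of observations <= x, so that x_j <= x < x_{j+1},
        # with x_0 and x_{n+1} read as minus and plus infinity.
        j = bisect_right(xs, x)
        return heights[j]

    return estimate
```

With the sample 1, 2, 3 the squared-error rule takes the values 1/5, 2/5, 3/5 and 4/5 on the four intervals, so it never reaches 0 or 1, while the sample cumulative function takes the values 0, 1/3, 2/3 and 1.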


NEWS AND NOTICES 
Readers are invited to submit to the Secretary of the Institute news items of interest 


Personal Items 


F. J. Anscombe of Cambridge, England has been appointed Research Asso- 
ciate in the Department of Mathematics, Princeton University, for the year 
1953-1954. 







Dr. R. R. Bahadur, formerly Professor of Statistics at the Indian Council of 
Agricultural Research, New Delhi, is now Visiting Assistant Professor in the 
Department of Mathematical Statistics, Columbia University. 

Dr. Arnold Binder has accepted an Assistant Professorship in the Department 
of Psychology, Indiana University, Bloomington, Indiana. 

Dorothy Brady has been appointed Chief of the Division of Prices and Cost of 
Living of the Bureau of Labor Statistics. 

Dr. Kermit G. Clemens, formerly instructor in mathematics at the University 
of Oregon, is now a statistician at the Naval Ordnance Test Station, Inyokern, 
China Lake, California. 

Dr. S. G. Ghurye is an Assistant Professor in the Department of Mathematics
at the University of Oregon, Eugene, Oregon for the current academic year. 

Professor Leo A. Goodman, formerly Assistant Professor of Statistics and
Sociology at the University of Chicago, has been promoted to Associate Professor 
and will be on a leave of absence for the academic year 1953-54, working at 
Cambridge University, Faculty of Mathematics, Statistical Laboratory, St. 
Andrew’s Hill, Cambridge, England as a Research Fellow under the Fulbright 
Program and as an Honorary Research Training Fellow of the Social Science 
Research Council. 

E. J. Gumbel has been appointed Adjunct Professor of Industrial Engineering 
at Columbia University, New York. During the past summer term he was teach- 
ing mathematical statistics in the Free University of Berlin. 

Messrs. Leo J. Tick and Leon H. Herbach, formerly of Columbia University 
and Brooklyn College, have joined the Staff of the Research Division, College 
of Engineering, New York University as Research Associates. They will be part 
of a newly formed statistical research and consulting group, which does contract 
research in applied statistics and probability, and assists other units in the Re- 
search Division with the statistical aspects of their research. 

Daniel G. Horvitz, formerly Assistant Professor, Department of Biostatistics, 
Graduate School of Public Health, University of Pittsburgh, has accepted a 
position as Associate Professor in the Department of Experimental Statistics, 
North Carolina State College. 

Dr. A. T. James, who has been at Princeton University for the past three 
years, has now rejoined Dr. E. A. Cornish in the Commonwealth Scientific and 
Industrial Research Organization, Section of Mathematical Statistics, Uni- 
versity of Adelaide, South Australia. 

Wharton F. Keppler, formerly SQC Specialist at Central Air Procurement 
District in Detroit, Michigan, is now employed as an Analytical Statistician in 
the U. S. Naval Ordnance Test Station, Assessment Division, Test Design & 
Evaluation Branch, China Lake, California. 

Dr. Salem H. Khamis of the Statistical Office of the United Nations has ac- 
cepted an Associate Professorship at the Economic Research Institute, American 
University of Beirut, Beirut, Lebanon. 

Fred E. Kindig, formerly with the Westinghouse Electric Corporation, East
Pittsburgh, Pennsylvania, is now Director of Quality Control with the Phoenix
Glass Company, Monaca, Pennsylvania.

Associate Professor Frank Massey is on leave of absence from the University 
of Oregon for the academic year 1953-54. He has a Ford Fellowship and plans to 
spend most of the year at Harvard. 

L. F. Nanni, who has been an Instructor of Mathematics at Rutgers Uni- 
versity, is now associated with the Engineering Department in the category of 
an Assistant Professor where he is doing research in Communication Statistics. 

Peter K. Newman, who has been a Research Associate at Stanford University, 
has returned to the University of Oxford, London, England. 

Woodrow W. Page is presently employed as a Senior Engineer in Aerophysics 
Department of the Consolidated Vultee Aircraft Corporation, Fort Worth, 
Texas. 

Stefan Peters, formerly Associate Professor of Insurance at the University of 
California, Berkeley, is now Actuary of Connell, Price & Company, Consulting 
Actuaries, Boston, Massachusetts. 

A. E. Sarhan, who has been studying at Harvard University, is going to work 
on his Ph.D. under the supervision of Professor B. G. Greenberg, University of 
North Carolina. 

Dr. S. S. Shrikhande has returned to his permanent position as Assistant
Professor of Mathematics, College of Science, Nagpur, India. 

A. J. F. Siegert from the Department of Physics, Northwestern University, 
has received a Guggenheim Fellowship for research in theoretical physics and 
will be at the Institute for Advanced Study, Princeton, New Jersey during the 
academic year 1953-54. 

Paul B. Simpson, formerly Associate Professor of Economics, University of
Oregon, Eugene, Oregon, is now acting Chief, Business Finance and Capital 
Markets Section, Division of Research and Statistics, Board of Governors, 
Federal Reserve System, Washington, D. C. 

Dr. Paul N. Somerville, who has been doing graduate work at the University 
of North Carolina at Chapel Hill, has been appointed Associate Professor in the 
Department of Statistics and Statistical Laboratory, Virginia Polytechnic In- 
stitute. 

Fred L. Strodtbeck, formerly of Yale University, has accepted a position as 
Associate Professor in the Law School and in the Department of Sociology, Uni- 
versity of Chicago. As a member of the Ford Foundation Law-Behavioral Sci-
ences Research Project, he is conducting experimental studies of jury behavior
in addition to other duties. 


Seiji Sugihara is now employed as a Statistician at the Office of the Naval 
Inspector of Ordnance, Rochester, New York. 

W. A. Thompson, who has been a student at the University of North Carolina 
for the past three years, is now doing research work at Virginia Polytechnic 
Institute. 

John S. White has accepted a position as Assistant Professor in the Department
of Actuarial Mathematics and Statistics at the University of Manitoba, Winnipeg,
Canada.


John Woodward has accepted a position as Analytical Statistician in the 
Surveillance Laboratory, BRL, Aberdeen Proving Ground, Maryland. 




New Members 


The following persons have been elected to membership in the Institute 


August 19, 1953 to November 16, 1953 


Abrams, Israel J., M.A. (Univ. of California), Graduate Research Statistician, Department 
of Mathematics, Statistical Laboratory, University of California, Berkeley, Califor- 
nia, 2216 Carleton Street, Berkeley 4, California. 

Bashara, Nicholas M., B.S. (Nebraska Univ.), Physicist, Minnesota Mining and Manu-
facturing, 367 Grove Street, St. Paul, Minnesota.

Brunk, Hugh D., Ph.D. (Rice Inst.), Associate Professor of Mathematics, University of 
Missouri, 809 Sunset Lane, Columbia, Missouri. 

Cadwell, James H., M.A. (Cambridge Univ.), Principal Scientific Officer, Ministry of 
Supply, 102 Station Crescent, Ashford, Middlesex, England. 

Doynow, David N., M.S. (North Carolina State College), Statistician, Bell Telephone 
Laboratories Inc., 2911 Barnes Avenue, New York City 67, New York. 

Ferguson, Thomas S., A.B. (Univ. of California), Graduate Research Statistician, Statis- 
tical Laboratory, University of California, Berkeley, California. 

Folkert, Jay E., M.A. (Univ. of Michigan), Associate Professor of Mathematics, Hope Col- 
lege, Holland, Michigan, 202 W. 15th Street, Holland, Michigan. 

Garner, Norman R., M.S. (North Carolina State College), Analytical Statistician, Thiokol 
Corporation, Redstone Division, Huntsville, Alabama, 43 Bide-A-Wee Drive, Hunts-
ville, Alabama.

Golomski, William A., M.S. (Marquette Univ.), Instructor in Mathematics, Marquette 
University, Milwaukee 3, Wisconsin. 

Haight, Frank Avery, M.S. (Iowa), Senior Lecturer in Mathematics, Auckland University 
College, Auckland, New Zealand. 

Horton, William H., M.S. (Oklahoma A. & M. College), Statistician, Metallurgical Develop- 
ment Section, Materials Division (K-90), Westinghouse Electric Corporation, East 
Pittsburgh, Pennsylvania. 

Kennard, Robert W., M.S. (Univ. of Delaware), Teaching Assistant, Carnegie Institute of 
Technology, Graduate Student in Mathematics, 149 S. Negley Avenue, Pittsburgh 6,
Pennsylvania. 

Laine, Roland O., A.B. (George Washington Univ.), Analyst, Civil Service, Department of 
Defense, 876 North Kensington Street, Arlington &, Virginia. 

Moore, Peter G., B.S. (London Univ.), Assistant Lecturer in Statistics, University College, 
London, Department of Statistics, University College, Gower Street, London W.C.1, 
England (Temporary address until June 1, 1954) 185 Graduate School, Princeton Uni- 
versity, Princeton, New Jersey. 

Moore, Roger A., M.A. (Univ. of California), Statistician, Systems and Tactics Group, 
Engineering Department, North American Aviation, Inc., Los Angeles, California, 
907 W. Hillcrest Boulevard, Apartment #2, Inglewood, California.

Moranda, Paul B., Ph.D. (Ohio State Univ.), Research Engineer, North American Aviation 
Company, Downey, California, 4442 Adenmoor, Long Beach 8, California. 

Moss, Norman R., B.S. (Univ. of Pittsburgh), Quality Control Statistician, Firth Sterling, 
Inc., Pittsburgh 30, Pennsylvania, 250 Maryellen Street, East McKeesport, Pennsylvania.







Stapleton, James H., B.A. (Michigan State Normal College), Research Assistant, Purdue 
University, 103 W. Stadium, West Lafayette, Indiana. 

Sweeny, Hale C., M.S. (Virginia Polytechnic Inst.), Assistant Professor, Department of 
Statistics, Virginia Polytechnic Institute, Box 646, Blacksburg, Virginia. 

Wilkes, John D., Ph.D. (California Tech.), Research Coordinator, Office of Naval Research 
Branch Office, San Francisco, California, 901 California Street, San Francisco, California. 

Zumbado B., Fernando, M.A. (Univ. of Michigan), Actuary and Statistician, Instituto 
Nacional de Seguros, San José, Costa Rica and Professor of Mathematics and Statistics,
University of Costa Rica, Box 428, San José, Costa Rica, Central America.

Zyskind, George, B.S. with honors (McGill Univ.), Graduate Student and Teaching Fellow 
in Mathematics, 14, Borden Street, Toronto, Ontario, Canada.




Positions at Iowa State College 


A number of graduate assistantships and associateships in statistics are avail-
able for the 1954-55 academic year at Iowa State College for work connected
with experimental design, theory and sample survey research. These include 
several Research and Development assistantships supported by the Iowa Agri- 
cultural Experiment Station to train students as statistical consultants in agri- 
cultural and biological areas. Assistantships are for 9 or 12 months and pay $125 
a month. A variable number of available positions will be supported by current 
contractual research projects. Also there are openings for several assistants or 
associates (instructors) on the teaching budget of the Department of Statistics. 
Associates on half-time appointments for 12 months receive $1800; those on full- 
time, $3600. For more information, write to the Statistical Laboratory, Iowa 
State College. 




Summer Sessions at Iowa State College 


Statistics courses will be offered at Iowa State College during the two six- 
weeks sessions of summer quarter 1954, June 14—August 27, for the 29th consecu- 
tive year. This summer’s offerings of the Department of Statistics are designed 
primarily for graduate students minoring in statistics and majoring in some 
other substantive subject-matter area, and for beginning graduate students 
majoring in statistics who may find it necessary to satisfy prerequisite require- 
ments for the more advanced courses in statistics. 

Senior staff members of the Statistical Laboratory and the Department of 
Statistics are available at least part of the summer quarter for consultations on 
special problems courses in statistics and on research connected with graduate 
theses. 

In addition, during the first six-weeks session (June 14—July 21), the following 
will be offered: Statistical Methods for Research Workers, Experimental Designs 
for Research Workers, Statistical Theory for Research Workers. During the 
second session (July 21—August 27), the following will be offered: a continuation 
of the two courses, Statistical Methods for Research Workers and Statistical 
Theory for Research Workers, and Survey Designs for Research Workers. 







Summer Sessions at Berkeley, California


This year’s program at the Statistical Laboratory of the University of Cali- 
fornia, Berkeley, California, consists of two sessions: June 21—July 31 and August 
2-September 11, 1954. The faculty of the summer sessions will include Professor 
C. R. Rao of Presidency College, Calcutta; Professor J. Neyman, Professor 
Harry M. Hughes and Professor Terry A. Jeeves of the Statistical Laboratory, 
University of California. 

The program includes two of the usual undergraduate courses in each session. 
In addition Professor Rao will give a new course designed to acquaint students 
with multivariate analysis, including analysis of variance and covariance, factor 
analysis, and discriminant functions. Professor Neyman will be available for 
consultations on work leading to higher degrees. Further information may be 
obtained by writing the Statistical Laboratory, 5416 Dwinelle Hall, University 
of California, Berkeley 4, California. 




REPORT OF THE WASHINGTON MEETING OF THE 
INSTITUTE OF MATHEMATICAL STATISTICS 


The fifty-eighth meeting and the sixteenth annual meeting of the Institute 
of Mathematical Statistics was held in Washington, D. C., on December 27-30, 
1953, in conjunction with meetings of the American Statistical Association, 
Econometric Society, Biometric Society, and other social science societies.

The following 363 members of the Institute attended the meeting: 

Helen Abbey, Forman S. Acton, Beatrice Aitchison, Sigurd L. Anderson, R.
L. Anderson, Walter J. Angulo, Francis J. Anscombe, Kenneth J. Arnold, Leo 
Avedis Aroian, R. R. Bahadur, Edward W. Bailey, John C. Bain, Alfonso T. 
Bancroft, Edward W. Barankin, James B. Bartoo, N. M. Bashara, Geoffrey 
Beall, Robert E. Bechhofer, J. N. Berbettoni, Agnes P. Berger, Joseph Berkson, 
Charles A. Bicking, Patrick P. Billingsley, Allan Birnbaum, Z. William Birn- 
baum, David Blackwell, Archie Blake, Julius R. Blum, Isadore Blumen, Colin 
R. Blyth, John Boddie, Nathan B. Borden, Paul Boschan, Albert H. Bowker, 
Ralph A. Bradley, Alva Esmond Brandt, Maurice F. Bresnahan, Robert L. 
Brickley, Glenn W. Brier, Irwin D. J. Bross, Bernice Brown, Richard H. Brown, 
K. Alexander Brownlee, Robert W. Burgess, R. S. Burlington, Irving W. Burr,
Joseph M. Cameron, Edward W. Cannon, Gerald T. Cargo, Douglas G. Chap- 
man, Jack Chassan, Yunien Chen, Herman Chernoff, Willard H. Clatworthy, 
William G. Cochran, A. C. Cohen, Jr., Samuel E. Cohen, William S. Connor,
Clyde H. Coombs, Ellsworth B. Cook, Arthur H. Copeland, Sr., Louis J. Cote, 
L. M. Court, John H. Cover, David Cowan, Dudley J. Cowden, Edwin L. Cox, 
Cecil C. Craig, Harald Cramér, Marcus T. Crapsey, Jean Crickett, Edwin L. 
Crow, Lee Crump, John H. Curtis, Joseph F. Daly, Cuthbert Daniel, Donald 
A. Darling, Reed B. Dawson, Jr., Besse B. Day, Walter L. Deemer, D. George 
Deihl, Francis R. Del Priore, W. Edwards Deming, Lucille Derrick, Wilfrid J.
Dixon, James L. Dolby, Tom G. Donnelly, Robert Dorfman, Harold F. Dorn,
Acheson J. Duncan, David B. Duncan, C. W. Dunnett, David Durand, Arthur
M. Dutton, Meyer Dwass, Churchill Eisenhart, Daniel R. Embody, Benjamin 
Epstein, Walter T. Federer, Alvin V. Fend, Robert Ferber, Clarence B. Fine, 
Carl H. Fischer, Evelyn Fix, John C. Flanagan, J. Sutherland Frame, Lester R. 
Frankel, David Frazier, John E. Freund, Holly C. Fryer, W. R. Gaffey, Norman 
R. Garner, John E. Garrett, Donald P. Gaver, Lincoln J. Gerende, Dorothy M. 
Gilford, Leon Gilford, John W. Gilmore, Meyer A. Girshick, William A. Golom- 
ski, Mina H. Gourary, Samuel W. Greenhouse, Joseph A. Greenwood, Evelyn 
Grossman, Harold Gulliksen, Emil J. Gumbel, Lee S. Gunlogson, John Gurland, 
Margaret Gurney, Raef K. Haddad, Robert John Hader, John Hagan, Keet W. 
Halbert, K. David C. Haley, Max Halperin, James F. Hannan, Morris H. Han- 
sen, Robert H. Hanson, Bernard Harris, Theodore E. Harris, Boyd Harshbarger, 
Herman Otto Hartley, Willis LeRoy Hasty, Leon H. Herbach, G. Ronald Herd, 
Warren M. Hirsch, Williston C. Hobbs, Joseph L. Hodges, Jr., R. G. Hoffmann, 
Paul G. Homeyer, Robert Hooke, J. Burke Horton, Daniel G. Horvitz, Harold 
Hotelling, Earl E. Houseman, Hendrik S. Houthakker, William G. Howard,
Elvin A. Hoy, John Paul Hoyt, Harry M. Hughes, Helen M. Humes, J. Stuart 
Hunter, David V. Huntsberger, Cuthbert Hurd, Eric R. Immel, Paul E. Irick,
Walter W. Jacobs, J. Edward Jackson, Terry A. Jeeves, Howard L. Jones, Wayne 
H. Jones, Alice S. Kaitz, Leo Katz, J. Kiefer, Allyn W. Kimball, Edgar P. King,
Tjalling C. Koopmans, Carl F. Kossack, Richard L. Kozelka, Kenneth H. 
Kramer, William H. Kruskal, Roy R. Kuebler, Jr., Morton Kupperman, Badrig 
M. Kurkjian, George M. Kuznets, Robert Boyd Ladd, Jack Laderman, Roland 
O. Laine, Donald E. Lamphiear, Otis E. Lancaster, Lucien M. LeCam, Erich 
L. Lehmann, Werner R. Leimbacher, William A. Lesansky, Howard Levene, 
Joseph Levi, Edward Abraham Lew, Benjamin Libstein, Gerald J. Lieberman, 
Gilbert Lieberman, Julius Lieblein, Rensis Likert, Richard F. Link, Stuart P. 
Lloyd, Michel Loéve, Frederic M. Lord, Fred W. Lott, Ardie Lubin, Eugene 
Lukacs, George F. Lunger, Ralph L. Madison, William G. Madow, Clifford J. 
Maloney, Benjamin Malzberg, John Mandel, Charles L. Marks, Eli S. Marks,
A. S. Marthens, Robert H. Matthias, Philip J. McCarthy, Brockway McMillan,
Robert G. McMillan, Gertrude A. McQuaid, Paul Meier, Margaret Merrell, 
Herbert A. Meyer, Max R. Mickay, Jr., Max F. Millikar, Albert Mindlin, 
Robert Mirsky, Sutton Monro, William J. Moonan, Cordell B. Moore, Marjorie 
E. Moore, Peter G. Moore, Milton Morrison, J. E. Morton, Lincoln E. Moses,
Bruce D. Mudgett, Ikhtiar-Ul Mulk, Luis F. Nanni, Raymond Nassimbene,
Mary G. Natrella, Morton J. Netzorg, Paul M. Neurath, Russell T. Nichols, 
Harold Nisselson, Gottfried E. Noether, Monroe L. Norden, James A. Norton, 
Jr., Edwin G. Olds, Ingram Olkin, Ernest L. Osborne, James G. Osborne, D. B. 
Owen, Toby Oxtoby, William R. Pabst, Jr., Woodrow W. Page, Emanuel Parzen, 
Allan E. Paull, Edward Paulson, Ken-Chen Peng, George W. Petrie, Bernard 
E. Phillips, Lillian H. Phillips, Eugene W. Pike, K. C. Sreedharan Pillai, James 
H. Powell, Don C. Price, Frank Proschan, James A. Rafferty, Howard Raiffa, 
Lili Knudsen Randolph, P. H. Randolph, Lowell J. Reed, Mina Rees, Albert 
T. Reid, Joseph S. Rhodes, Herbert Robbins, Robert Roeloffs, Harry G. Romig,
J. H. Roseboom, Harry M. Rosenblatt, John R. Rosenblatt, Irving Roshwalb,
Michael B. Rowan, S. N. Roy, Ernest Rubin, Herman Rubin, David Rubinstein,
Evelyn L. Rumer, Charles H. Rust, Rose Sachs, Marion M. Sandomire, I. 
Richard Savage, Leonard J. Savage, Mary Ann Savas, Henry Scheffé, Sylvia 
Schlachter, Samuel A. Schmitt, Marvin A. Schneiderman, Ellsworth B. Shank, 
Donald J. Shaw, Richard H. Shaw, David Sheppard, Irving H. Siegel, Jack 
Silver, Walt R. Simmons, Monroe G. Sirken, Rosedith Sitgreaves, Clarence D. 
Smith, H. Fairfield Smith, John H. Smith, Robert T. Smith, III, Milton Sobel, 
Anson Solem, Herbert Solomon, Paul N. Somerville, Dudley E. South, Mortimer 
Spiegelman, Ralph B. Stauber, Charles Stein, Joseph Steinberg, Rothwell 
Stephens, Fred L. Strodtbeck, R. M. Sundrum, Hale C. Sweeney, Nancy A. 
Simens, Z. Szatrowski, William F. Taylor, Henry Teicher, D. Teichroew, Ben- 
jamin J. Tepping, Donovan J. Thompson, W. R. Thompson, Leo J. Tick, Fred 
H. Tingey, G. Tintner, Malcolm W. W. Tomlinson, Donald R. Truax, Albert 
W. Tucker, John W. Tukey, C. R. M. Tuttle, M. C. Kenneth Tweedie, George 
W. Tyler, Jose Vergara, David F. Votaw, Jr., F. M. Wadley, Helen M. Walker, 
David L. Wallace, W. Allen Wallis, Lionel Weiss, Samuel Weiss, Everett L.
Welker, Phillips Whidden, Alfred G. Whitney, Frank Wilcoxon, Samuel S.
Wilks, Max A. Woodbury, Charles A. Wright, Jacob Yerushalmy, W. J. Youden, 
Marvin Zelen. 
The Program follows: 


SUNDAY, DECEMBER 27, 1953 


10:30 a.m. Logical Foundations of Probability Theory and Statistical 
Inference 


Chairman: T. A. Bancroft, Iowa State College 
Papers: 1. Inductive Approach, Gerhard Tintner, Iowa State College 
2. Set Theory Approach, M. Loéve, University of California 
3. Frequency Approach, A. Copeland, University of Mich- 
igan 
Discussion : F. J. Anscombe, Cambridge, England and J. W. Tukey, 
Princeton University 


2:00 p.m. Stochastic Processes I. 


Chairman: D. Blackwell, Howard University 
Papers: 1. Stochastic Learning Theory, M. M. Flood, Columbia 
University 
2. Interpolatory Stochastic Processes and Some Simple 
Tests, D. Darling, University of Michigan 
3. Statistical Inference in Poisson Processes, Allan Birn- 
baum, Columbia University 
4. Relationship of Certain Learning Models to More General 
Stochastic Processes, T. E. Harris, Rand Corporation 







4:00 p.m. Stochastic Processes II.


Chairman: T. A. Jeeves, University of California 
Papers: 1. Stochastic Processes Arising in Some Tests of Goodness of 
Fit, J. Kiefer, Cornell University 
2. Information Contained in a Finite Time Interval, Herman 
Rubin, Stanford University 
3. Small Sample Distribution and Bias of Least Squares 
Estimators in a Discrete Markov Process, John Gurland, 
Iowa State College 


1953 Council Meeting 


MONDAY, DECEMBER 28, 1953
9:00 a.m. Contributed Papers I. 


Chairman: M. Halperin, National Heart Institute 
Papers: 1. Confidence Intervals of Fixed Length for the Poisson 
Mean and the Difference Between Two Poisson Means, 
Allan Birnbaum, Columbia University 
2. Convexity Properties of the Alpha-Beta-Set Under Com- 
posite Hypotheses, Herman Rubin and Oscar Wesler, 
Stanford University 
3. Critical Regions in Terms of Lower Dimensional Critical 
Regions, L. M. Court, Diamond Ordnance Fuze 
Laboratory 
4. A Stochastic Model of Traffic Congestion, C. B. Winsten,
Cowles Commission (introduced by H. S. Houthakker)
5. On a Property of Certain Linear Functions of Order Sta-
tistics from Some Normal Populations. (By title),
K. C. Seal, University of North Carolina
6. A Historical Note on the Relation Between Extreme
Values and Tensile Strength. (By title), Julius Lieb-
lein, National Bureau of Standards
7. A Note on Statistics and Sigma-Subfields. (By title),
R. R. Bahadur, Columbia University


9:00 a.m. Application of Stochastic Methods to Studies of Growth 


(co-sponsored by the American Statistical Association 
and Biometric Society) 


Chairman: H. Fairfield Smith, North Carolina State College 

Papers: 1. Stochastic Processes and Their Application to Growth of 
Populations, Epidemics, and Rumors, A. T. Reid, 
Columbia University 





2. A Stochastic Model for Selection of Micro-Nuclei in
Paramecium Growth. (Preliminary Report), A. W. Kimball
and A. S. Householder, Oak Ridge National
Laboratory


11:00 a.m. Rietz Lecture (co-sponsored by the Econometric Society)


Chairman: M. A. Girshick, Stanford University
Speaker: Professor Harald Cramér, On Some Questions Connected
with Mathematical Risk


2:00 p.m. Structural Relations Between Random Variables (co-
sponsored by Econometric Society)


Chairman: T. C. Koopmans, University of Chicago
Papers: 1. The General Problem of Linear Structural Relations,
T. A. Jeeves, University of California
2. The Problem of Efficiency of Estimates of Structural
Parameter, C. M. Stein, Stanford University
Discussion: Leonid Hurwicz, University of Minnesota and Herman Rubin,
Stanford University


4:00 p.m. Survey of Nonparametric Theory and Methods (co-
sponsored by American Statistical Association)


Chairman: Lincoln E. Moses, Stanford University
Papers: 1. Characterization of Distribution-Free Statistics, Z. W.
Birnbaum, University of Washington
2. A Genesis of Rank Tests and Some Conjectures Based on
It, E. L. Lehmann, University of California
3. Optimum Nonparametric Tests for Small Samples, I. R.
Savage, National Bureau of Standards
Discussion: Meyer Dwass, Northwestern University, Howard Levene,
Columbia University and R. M. Sundrum, University of
North Carolina


Informal Reception (co-sponsored by American Statistical
Association)


TUESDAY, DECEMBER 29, 1953


Preliminary Tests of Significance and Pool Rules (co-
sponsored by American Statistical Association and Bio-
metric Society)


Chairman: W. G. Cochran, Johns Hopkins University
Papers: 1. To Pool or Not to Pool, A. E. Paull, Abitibi Power and
Paper Company, Ltd.
2. Some Applications, T. A. Bancroft, Iowa State College
3. An Extension, D. Huntsberger, Iowa State College





Discussion: Robert Bechhofer, Cornell University and J. W. Tukey,
Princeton University


9:00 a.m. The Theory of Organizations (co-sponsored by the Econo-
metric Society)


Chairman: William Vickrey, Columbia University
Papers: 1. Aspects of Design in the Theory of Organization, David
Rosenblatt, American University
2. Some Factors Affecting Economic Organization, Stanley
Reiter, Stanford University
3. The Firm as a Team, Jacob Marschak and Roy Radner,
Cowles Commission for Research in Economics
Discussion: Robert Dorfman, University of California and Merrill M.
Flood, Columbia University


2:00 p.m. Recent Developments in Decision Theory


Chairman: Jack Laderman, Office of Naval Research
Papers: 1. Explicit Characteristics of Complete Classes for Some
Testing Problems, Allan Birnbaum, Columbia University
2. An Extension of Wald's Theory of Statistical Decision
Functions, Lucien LeCam, University of California
3. Optimum Invariant Decision Functions, M. A. Girshick,
Stanford University
4. Point Estimation and the Theory of Decision, L. J.
Savage, University of Chicago


4:00 p.m. Contributed Papers II.


Chairman: J. Lieblein, National Bureau of Standards
Papers: 1. Completeness, Similar Regions, and Unbiased Estimation,
Part II. (Preliminary Report), E. L. Lehmann and
Henry Scheffé, University of California, Berkeley
2. On Linear Regression Analysis when the Dependent
Variable is Rectangular, E. G. Olds, Carnegie Institute
of Technology
3. Some Further Results in Simultaneous Confidence Interval
Estimation, S. N. Roy, University of North
Carolina
4. A Method for Generating Random Variates on Electronic
Computers, D. Teichroew, National Bureau of Standards
5. Theory of Successive Multiphase Sampling. (Preliminary
Report), B. D. Tikkiwal, Columbia University (introduced
by M. A. Schneiderman)







6. Estimation of the Size of a Stratified Population, Douglas 
G. Chapman and C. O. Junge, Jr., University of 
Washington and Washington State Department of 
Fisheries 

7. Sequential Life Tests in the Exponential Case. (By title), 
Benjamin Epstein and Milton Sobel, Wayne Uni- 
versity and Cornell University 

8. Distributions of Some Integrals of Certain Gaussian
Stochastic Processes and the Limiting Distributions of
Some "Goodness of Fit" Criteria. (By title), T. W.
Anderson, Columbia University

9. Some Sampling Results on the Power of Nonparametric 
Tests Against Normal Alternatives. (By title), W. J. 
Dixon and D. Teichroew, National Bureau of Stand- 
ards 


6:00 p.m. Membership Meeting 


WEDNESDAY, DECEMBER 30, 1953 
8:30 a.m. Contributed Papers III. 


Chairwoman: M. Gurney, Bureau of the Census 
Papers: 1. On the Large Sample Power of Rank Order Tests in the
Two-Sample Problem, Meyer Dwass, Northwestern
University
2. Multiple Tests and Intersection Region Procedures,
David L. Wallace, Massachusetts Institute of Technology
3. Some Significance Test Procedures for Multiple Comparisons,
H. O. Hartley, Iowa State College
4. A Sequential Test of Randomness Against Linear Trend,
Gottfried E. Noether, Boston University
5. Further Results in the Theory of Quality Control Charts,
Leo A. Aroian, Hughes Research and Development
Laboratories
6. Minimum Life in Fatigue, E. J. Gumbel and A. M.
Freudenthal, Columbia University
7. Bounds for the Distribution Function of a Sum of Independent,
Identically Distributed Random Variables.
(By title), Wassily Hoeffding and S. S. Shrikhande,
University of North Carolina and University of
Nagpur
8. Sequential Rank Sum Tests. (Preliminary Report.)
(By title), Chia Kuei Tsao, Wayne University
9. A Remark on the Geometrical Method of Construction of
an Orthogonal Array. (By title), Esther Seiden, University
of Chicago
10. Some Contributions to the Theory of Markov Chains.
(Preliminary Report.) (By title), Cyrus Derman,
Columbia University
11. Minimax Invariant Procedures for Estimating Cumulative
Distribution Functions. (By title), Om P.
Aggarwal, University of Washington


Applications of Survivorship Methods (co-sponsored by
American Statistical Association, Biometric Society,
and American Public Health Association)


Chairman: Allan Birnbaum, Columbia University
Papers: 1. Estimation of Length of Hospital Stay from Discharge
Data, C. A. Bachrach, Johns Hopkins University
2. Parametric Estimation of Survivorship, D. J. Davis,
Rand Corporation
Discussion: B. Epstein, Wayne University and P. Densen, University
of Pittsburgh


10:30 a.m. Multiple Decision Problems (co-sponsored by the American
Statistical Association)


Chairman: L. Weiss, University of Virginia
Papers: 1. On Optimum Slippage Tests for the Variances of k Normal
Distributions, Donald Truax, Stanford University
2. A Single-Sample Multiple Decision Procedure for Ranking
Means of Normal Populations with Known Variances,
Robert Bechhofer, Cornell University
3. A Two-Sample Multiple Decision Procedure for Ranking
Means of Normal Populations with a Common Unknown
Variance, Charles Dunnett, Lederle Laboratories
4. A Sequential Multiple Decision Procedure for Ranking
Means of Normal Populations with Known Variances,
Milton Sobel, Cornell University
Discussion: Cuthbert Daniel, New York and J. W. Tukey, Princeton
University


10:30 a.m. Estimation of Rates (co-sponsored by American Statistical
Association and Biometric Society)


Chairman: E. G. Olds, Carnegie Institute of Technology
Papers: 1. Estimation of the Interval Rate in Actuarial Calculations:
A Critique of the Person-Years Concept, J. Berkson,
Mayo Clinic
2. Improvement Rate of Mental Patients, E. Fix, University
of California
3. Problem of Within-Family Attack Rates, W. R. Gaffey,
University of California
4. Consistency of Estimators Under a Specialized Bioassay
Procedure, W. F. Taylor, School of Aviation Medicine


2:00 p.m. Survey of the Theory of Finite Sampling (co-sponsored by 
the American Statistical Association) 


Chairman: Morris H. Hansen, Bureau of the Census 
Papers: 1. Aims and Methods, W. G. Cochran, Johns Hopkins
University
2. Finite Sampling Concepts in Experimental Statistics,
J. Cornfield, National Institute of Health
3. The Design of Sample Surveys, J. F. Daly, Bureau of the
Census
Discussion: H. O. Hartley, Iowa State College and J. W. Tukey,
Princeton University


4:00 p.m. 1954 Council Meeting 


HAROLD NISSELSON 
Assistant Secretary 


REPORT OF ANNUAL MEMBERSHIP MEETING, DECEMBER 29, 1953 


The Annual Business Meeting, in the Shoreham Hotel, Washington, D. C., 
December 29, 1953 was called to order at 6:10 P.M. by Morris H. Hansen, Presi- 
dent. 

The President reported on the activities of the Institute during 1953. 

The Secretary-Treasurer submitted his report for the year 1953. 

The Editor submitted his annual report. 

The report of the Program Coordinator was submitted for him by the incoming 
Program Committee. 

The election of a President-Elect for 1954 (President in 1955), and of five 
members of the Council for the term 1954-1956 was held. Later in the meeting 
the tellers announced the election of Henry Scheffé as President-Elect and of T. 
W. Anderson, Jr., Joseph Berkson, Z. W. Birnbaum, D. H. Blackwell, and W. G. 
Madow as members of the Council. 

The Secretary moved, on behalf of the Council, that the present Exception B 
of Article 2 of the Bylaws be deleted. This section reads, “Any member may
make a payment in place of all succeeding annual dues based on a suitable table
and rate of interest specified by the Council.” The motion carried. The motion
having been previously approved by the Council, the section is deleted. 

The President announced that the next items of business related to two peti-
tions from members and to action by the Council relative to these petitions. The
bodies of the petitions had been mailed to the members and are not repeated 
here. The motions to amend which were part of these petitions appear below. 
The President reviewed the events leading up to the presentation of these mat- 
ters at the membership meeting and had distributed copies of the report to the 
Council from a committee consisting of W. G. Cochran, Chairman, Henry Scheffé 
and S. S. Wilks.

The Secretary moved, on behalf of 43 members of the Institute, the adoption 
of the following amendment to the Bylaws: 


“All meetings of the Institute shall be held on the basis of no racial segregation. In par-
ticular, prior to determining the place of a forthcoming meeting, the Secretary of the
Institute shall ascertain that meeting halls, eating facilities, and housing accommoda-
tions adequate for the expected attendance shall be available on a nonsegregated basis,
and that all social events connected with the meeting shall be nonsegregated.”


The Secretary then moved, on behalf of 31 members of the Institute, that the 
amendment just introduced be amended by 


“1. deleting ‘eating facilities and housing accommodations’ and substituting ‘and eating
facilities.’ and

“2. adding at the end ‘Every effort shall be made to provide nonsegregated housing
accommodations consistent with the laws of the locality of the forthcoming
meeting.’ ”


A difference in wording of the first sentence of the proposed bylaw between that 
given in the initial petition and that of the restatement in the second petition has 
not been called into question and is therefore considered not substantive. 

The Secretary then moved, on behalf of the Council, that action on the pro- 
posed amendments be deferred to the September 1954 meeting, under circum- 
stances where every member of the IMS will have adequate opportunity to vote 
by mail or in person after receiving the benefit of study and advice by the Council 
on these amendments. 

W. G. Cochran was recognized by the chair and opened debate on the motion 
to defer. 

The meeting supported the chair in a ruling that voting would not be in prog- 
ress until the vote was called for at the meeting. The meeting reversed the chair 
on a ruling that a motion to hold the meeting open for six weeks was in order. 
This motion was therefore declared out of order. 

After considerable discussion in which many members participated, a move to 
close debate carried by a vote of 114 to 22. 

The motion to defer was carried by a vote of 103 to 28. 

The Secretary then moved the unanimous recommendation of the Council that
the meeting direct the Council that (1) for the next three years, it shall conduct
all negotiations in the spirit and letter of the Kingston resolution¹, and (2) at the
end of this period, it shall summarize and discuss the experience with this policy,
and shall report to the membership its conclusions and recommendations for ap-
propriate action, if any.

¹ The text of the Kingston resolution is given in the Report of the President, page 193.

A motion to amend by reducing the period specified to “until a vote is taken
by the membership on these amendments” was defeated by a vote of 58 for to 67 
against. 

The motion introduced on behalf of the Council was carried by a large 
majority. 

The meeting adjourned at 8:45 P.M. 

K. J. ARNOLD 
Secretary 




REPORT OF THE PRESIDENT OF THE INSTITUTE FOR 1953 


The work of the Institute has continued with emphasis on the Annals and on 
meetings as the areas of principal service to members and to mathematical sta- 
tistics. These are the primary functions through which research and applications 
are supported and stimulated. Increased interest in mathematical statistics is 
reflected in the growing membership and in the number and quality of the papers. 
Advances in theory and in applications are continuing apace, the role of mathe- 
matical statistics is being extended into new fields, and is achieving wider recog- 
nition. 

The Secretary-Treasurer’s report again reflects a sound financial condition for 
the Institute. This has resulted primarily from steps taken in earlier years. The 
improved financial outlook has made it possible to focus attention now on new 
activities that could not be considered during the period of stress that we experi- 
enced only a few years ago. One result is an amendment to the Bylaws adopted 
at the Kingston meeting, providing for a further reduction of dues for students 
and for members residing outside the continental United States and Canada. 
This further dues reduction for these groups was made in order to promote 
membership and the availability of the Annals in areas where costs, especially 
in terms of dollars, represent a serious problem. Also steps have been initiated 
for publishing a more adequate membership directory. 

Another action arising out of our improved financial condition is the appoint- 
ment of a Finance Committee to advise on the extent to which it is desirable to 
accumulate a surplus, the investment of funds, and other financial matters. In 
addition, a Committee on Activities and Development was recently appointed 
to make recommendations to the Council on any proposed extensions or modifi- 
cations of the activities of the Institute. This Committee, with A. H. Bowker as 
Chairman, will welcome suggestions from the membership. 

At the Kingston meeting the Council passed the following Resolution in order 
to facilitate full participation of all members at meetings. 


“It is the policy of the Institute of Mathematical Statistics that all its meetings shall be
held on a completely nonsegregated basis. In particular, prior to determining the place of
a forthcoming meeting, the Secretary of the IMS shall ascertain that meeting halls,
eating facilities and housing accommodations adequate for the expected attendance
will be available on a nonsegregated basis, and that all social events connected with the
meetings shall be nonsegregated.”


There has been considerable discussion as to whether such a policy is appro- 
priately incorporated in the Bylaws or should remain as a policy statement of 
the Council. Also, there has been considerable discussion of the circumstances 
that constitute compliance with the policy. A committee consisting of William 
G. Cochran, Chairman, Henry Scheffé, and S. S. Wilks, has made a study and 
submitted a report to the Council. On the basis of this report and additional dis- 
cussion the Council unanimously reaffirmed the policy adopted at Kingston and 
recommended that the membership adopt a resolution supporting the Council 
position. In addition, it recommended that action on incorporating such a policy 
statement in the Bylaws be deferred to the September 1954 meeting so that mem- 
bers will have adequate opportunity to vote by mail or in person after receiving 
the benefit of study and advice by the Council. These Council recommendations 
were favorably acted upon at the membership meeting. The consequence is that 
the Council policy as adopted at Kingston is reaffirmed, while the desirability of 
incorporating such a policy statement in the Bylaws will receive additional con- 
sideration. 

The two persons who carry the heaviest load of Institute work are the Secre- 
tary-Treasurer, K. J. Arnold, and the Editor, Erich Lehmann. In addition, a 
heavy load of work is carried by the Editorial Committee, the Executive Com- 
mittee, and a number of others. I should like to mention particularly Leo Good- 
man who served as Program Coordinator and also as Chairman of the Program 
Committee responsible for organizing the December meeting at Washington, 
D. C.; D. B. DeLury who was the Program Committee Chairman for the Sep- 
tember meeting in Canada, at Kingston, Ontario; Ralph A. Bradley, who served 
as Chairman of the Eastern Regional Program Committee that organized the 
May meeting in Washington, D. C.; and A. H. Bowker who served as Chairman 
of the Western Regional Program Committee, which was responsible for organiz-
ing the West Coast meeting at Stanford in June. Associate Secretary Lionel Weiss
and Assistant Secretaries Glenn Burrows, G. L. Edgett, Harold Nisselson, and 
Rosedith Sitgreaves, carried the heavy load on physical arrangements for the 
meetings, as well as participating in the regular work of the Program Committees. 

The Membership Committee, with W. D. Baten as chairman, has initiated a 
program of canvassing and inviting graduate students of statistics to become 
members. An excellent response has been received. This seems to be especially 
worthwhile because new members who are likely to have a lasting interest in the
Institute are recruited. 

The activities of the Committee on Academic Institutional Memberships, with 
T. A. Bancroft as chairman, resulted in two new Institutional members. 

The extent of participation of many members in activities supporting the work 
of the Institute is indicated by the committees that have served during the year.
A list of the names and membership of the committees is appended, and I wish
to express the indebtedness of the Institute to the Chairmen and members for 
the work they have done. 

The Rietz memorial lecture was authorized for this year and Harald Cramér ac- 
cepted the invitation from the Institute to deliver the lecture. It was presented 
at the joint meetings with the American Statistical Association and other societies 
in Washington, D. C. in December. 

The following members of the Institute were recommended by the Committee 
on Fellows and have been elected as Fellows by the Council: 


R. G. D. Allen 
Herman Chernoff 
Leo A. Goodman 


J. C. Kiefer 
J. Marschak 
D. van Dantzig 


Max Woodbury 


The following persons have agreed to serve on the Nominating Committee for 
next year: 


Max Woodbury, Chairman 
Walter Bartky 

Ralph A. Bradley 

Edward Paulson 

George M. Kuznets 


Committees of the Institute, 1953


1. The Council and Committees of the Council
(a) Members of the Council
Term expires 1953: Harald Cramér, A. M. Mood, Jerzy Neyman, S. S. Wilks
Term expires 1954: A. H. Bowker, T. C. Koopmans, H. E. Robbins, H. G. Romig
Term expires 1955: W. G. Cochran, Churchill Eisenhart, Henry Scheffé, J. W. Tukey
(b) Executive Committee
M. H. Hansen, President
E. G. Olds, President-Elect
E. L. Lehmann, Editor
K. J. Arnold, Sec-Treas.
(c) Committee on Fellows
Term expires 1953: Churchill Eisenhart, Chairman; Jerzy Neyman
Term expires 1954: Gerhard Tintner, S. S. Wilks
Term expires 1955: W. G. Madow, Edward Paulson
(d) Associate Secretaries
W. H. Kruskal, Lionel Weiss
(e) Associate Treasurer
E. S. Pearson


2. Editorial Committee
(a) Associate Editors
David Blackwell, J. L. Hodges, Jr., Wassily Hoeffding, W. G. Madow,
A. M. Mood, J. Wolfowitz
(b) Cooperating Members
Z. W. Birnbaum, R. C. Bose, Kai Lai Chung, J. F. Daly, J. L. Doob,
T. E. Harris, Paul G. Hoel, J. Kiefer, William H. Kruskal, Howard Levene,
H. B. Mann, G. E. Noether, Edward Paulson, M. Peisakoff, H. E. Robbins,
S. N. Roy, L. J. Savage, Herbert Solomon, Charles M. Stein, Lionel Weiss,
Max A. Woodbury



3. Committees Related to Program
(a) Annual Meeting—Washington
L. A. Goodman, Chairman; Harold Nisselson, Asst. Sec.; Elizabeth L. Scott;
Joseph Berkson; David Blackwell; A. M. Mood; Howard Levene; R. L. Anderson
(b) Summer Meeting—Kingston
D. B. DeLury, Chairman; G. L. Edgett, Asst. Sec.; Benjamin Epstein;
D. J. Thompson; B. J. Tepping; G. J. Lieberman; C. F. Kossack; D. A. S. Fraser
(c) Eastern Region
R. A. Bradley, Chairman; Lionel Weiss, Assoc. Sec.; Glenn Burrows, Asst. Sec.;
I. D. J. Bross; W. N. Hurwitz; H. E. Robbins; D. F. Votaw; M. A. Woodbury
(d) Central Region
P. R. Rider, Chairman; W. H. Kruskal, Assoc. Sec.; D. A. Darling;
Oscar Kempthorne; Irving W. Burr; D. Ransom Whitney; Leo Katz
(e) Western Region
A. H. Bowker, Chairman; W. J. Dixon; Z. W. Birnbaum; J. L. Hodges, Jr.;
P. G. Hoel; A. M. Mood
(f) Program Coordinator
L. A. Goodman
(The Program Coordinator is an ex-officio member of all program committees)
(g) Special Invited Papers
David Blackwell, Chairman; T. W. Anderson; L. A. Goodman; D. B. DeLury;
R. A. Bradley; P. R. Rider; A. H. Bowker; E. L. Lehmann
* (h) West Coast Committee for Joint Meeting with AAAS in San Francisco in 1954
Includes West Coast Program Committee plus Elizabeth L. Scott, who has
responsibility of organizing joint IMS and AAAS sessions.


4. Promotional Committees
(a) Membership
W. D. Baten, Chairman; C. R. Blyth; D. B. DeLury; T. N. E. Greville; D. E. South
(b) Institutional Members
Academic: T. A. Bancroft, Chairman; A. H. Bowker; L. A. Knowler; A. E. Treloar
Nonacademic: H. G. Romig, Chairman; C. C. Hurd; F. E. Grubbs; Mary N. Torrey


5. Other Committees
(a) Nominating Committee (Appointed by 1952 President)
J. L. Hodges, Jr., Chairman; L. J. Savage; Howard Levene; M. A. Girshick;
Solomon Kullback; J. F. Daly; Herbert Solomon
(b) Rietz Lecture Committee
J. Neyman, Chairman; C. C. Craig; Will Feller
* (c) Professional Standards of Statisticians in Government Service
W. E. Deming, Chairman; H. F. Dorn; M. A. Bershad; Samuel Weiss;
E. E. Houseman; G. M. Kuznets; B. F. Kimball; W. R. Pabst; Churchill Eisenhart
* (d) Editorial Committee for the Publication of Wald’s Selected Papers
T. W. Anderson, Chairman; Harald Cramér; H. A. Freeman; E. L. Lehmann;
J. L. Hodges, Jr.; A. M. Mood; C. M. Stein
* (e) Committee to Consider Proposal for Summer Statistical Institute
J. Neyman, Chairman; G. M. Cox; W. G. Madow; H. A. Meyer; Herbert Solomon;
R. M. Thrall
** (f) Committee to Explore the Desirability of Reducing Dues for Students and
for Members outside Continental United States
H. W. Norton, Chairman; C. H. Fischer; K. J. Arnold; J. Neyman; Milton Sobel


* Ad hoc committee
** Committee has completed its assignment and been discharged.





* (g) Committee to Prepare Manual for Guidance of President
K. J. Arnold, Chairman; P. S. Dwyer; E. G. Olds; M. H. Hansen
(h) Advisory Committee on Statistical Computations
Z. W. Birnbaum, Chairman; A. H. Bowker; Churchill Eisenhart; C. C. Craig
** (i) Committee on Life Membership Rates
C. H. Fischer
** (j) Committee on Delegates to International Congress of Mathematicians 1954
Mina Rees, Chairman; Gertrude Cox; J. Neyman; J. Wolfowitz; Harald Cramér
(k) Finance Committee
Mortimer Spiegelman, Chairman; Carl Fischer; K. J. Arnold
* (l) Committee to Consider two Amendments to the Bylaws
W. G. Cochran, Chairman; Henry Scheffé; S. S. Wilks
* (m) Committee on Activities and Development
A. H. Bowker, Chairman; R. C. Bose; George W. Brown; J. H. Curtiss;
W. E. Deming; H. O. Hartley; Erich Lehmann; Eli S. Marks; John Riordan
** (n) Committee on Bureau of Standards Issue 1953
John Tukey, Chairman; F. C. Mosteller; H. Robbins; W. A. Shewhart; S. S. Wilks
(o) Committee on Exchanges
Paul S. Dwyer, Chairman; K. J. Arnold
(p) Committee on Scheduling Winter Meetings
C. C. Craig, Chairman; R. L. Anderson; R. Murphy; C. Kossack; J. L. Hodges, Jr.
* (q) Committee on Committee Procedures
H. Robbins, Chairman; J. F. Curtiss; K. J. Arnold; Z. W. Birnbaum; H. W. Norton


Representatives of the Institute for 1953 


To the American Association for the Advancement of Science

Harold Hotelling (Term expires 1954)

To the National Research Council, Division of Mathematics

S. S. Wilks (Term expires June 1954)

To the Mathematical Policy Committee 

Mina Rees (Term expires 1953) 

To the Joint Committee for Development of Statistical Applications in Engineering and Manu- 
facturing 

A. H. Bowker (Term expires 1954) 

On the Committee on Mathematical Training of Social Scientists 

W. G. Madow and T. W. Anderson 

To advise National Science Foundation regarding Classification List of Specialties in Mathe- 
matics 

B. J. Tepping (Term expires on completion of specific assignment) 

To the American Academy of Political and Social Sciences 

Max A. Woodbury and John Mauchly 


December 29, 1953 MORRIS H. HANSEN
President 





REPORT OF THE SECRETARY-TREASURER OF 
THE INSTITUTE FOR 1953 


At the end of 1953 the Institute had 1399 individual members and 9 institu- 
tional members. The Laboratory of Statistical Research, University of Washing-
ton was welcomed as a new institutional member during the year. The Statistical
Laboratory of the State College of Iowa and Michigan State College, Department
of Mathematics will be welcomed to institutional membership on January 1, 
1954. The year 1952 ended with 1280 individual members. During 1953, 157 new 
members were accepted, 28 former members were reinstated, 33 members re- 
signed, 31 were dropped for non-payment of dues and 2 died. 
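
The individual-membership figures quoted above are mutually consistent. As a minimal sketch of the arithmetic (the variable names are chosen here for illustration and do not appear in the report):

```python
# Individual-membership figures as quoted in the Secretary-Treasurer's report.
members_end_1952 = 1280
accepted = 157      # new members accepted during 1953
reinstated = 28     # former members reinstated
resigned = 33
dropped = 31        # dropped for non-payment of dues
died = 2

# Year-end count implied by the movements during 1953.
members_end_1953 = (
    members_end_1952 + accepted + reinstated - resigned - dropped - died
)
print(members_end_1953)  # 1399, the stated total at the end of 1953
```

The net gain for the year is therefore 119 individual members.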

Meetings 55, 56, 57 (15th Summer) and 58 (16th Annual) were held during 
1953. Reports of the first three have appeared in the Annals. The report of the 
Annual Meeting will appear in the March, 1954 issue of the Annals. On behalf of 
the Institute, the Secretary wishes to express appreciation to all who were instru- 
mental in making each of these meetings eminently successful. The Program 
Committee for each meeting is listed in the report of the President. The duties of 
an Associate Secretary for the meeting in Washington, D. C., on April 29-30 were 
performed by R. A. Bradley, Program Chairman and for the meeting in Stanford, 
California, on June 19-20, by Rosedith Sitgreaves, Assistant Secretary. 


INSTITUTE OF MATHEMATICAL STATISTICS
Statement of Condition
December 31, 1953


ASSETS

Dues Receivable
Accounts Receivable, Back Issues
U. S. Government Bonds
Certificates of Deposit                                          20,000.00
Inventory of Back Issues                                         18,588.37

Total assets                                                    $62,357.33


LIABILITIES AND RESERVES

Amount Due Printer for December Annals (Estimate)
Withholding Tax Payable
Miscellaneous Liabilities
Reserve for Dues Advanced
Reserve for Subscriptions Advanced
Reserve for Life Memberships
Reserve for Biometrika Subscriptions
Reserve for Maintaining Supply of Back Issues

Total Liabilities and Reserves                                   34,731.24

Surplus (Excess of Assets over Liabilities and Reserves)         27,626.09





Revenue and Expense Statement
January 1, 1953 to December 31, 1953


Revenues
Dues Revenue                                        $13,647.00
Subscriptions Revenue                                 9,242.65
Revenue from Sale of Back Issues
Cost of Back Issues Sold
Net Revenue from Sale of Back Issues                  4,503.26
Interest on Investments                                 325.00
Miscellaneous Revenue                                   131.36     $27,849.27

Expenses
Current Annals                       $11,449.68
Current Annals to inventory            1,111.77
Net cost of current Annals                          $10,337.91
Miscellaneous Printing, Stationery, Postage           1,921.99
Salary                                                3,180.00
Miscellaneous Office Expense                            478.63
Contributions to American Mathematical Society          133.72
Editorial Expenses
Meeting Expenses
President’s Expenses Account
Travel Expense                                                      16,444.14

Excess of Revenue over Expense                                      11,405.13
Transfer from Surplus to Reserve for Maintaining Supply of
Back Issues                                                          1,221.00

Increase in Surplus                                                 10,184.13
Surplus, December 31, 1952                                          17,441.96

Surplus, December 31, 1953                                         $27,626.09
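
The totals in the two statements can be cross-checked against one another. The sketch below assumes that the first two revenue amounts belong to dues and subscriptions, in that order, and that 16,444.14 is the total of all expenses; the variable names are illustrative only and are not taken from the report.

```python
from decimal import Decimal

# Revenue items as printed (assumed pairing of the first two amounts).
revenue_items = {
    "dues": Decimal("13647.00"),
    "subscriptions": Decimal("9242.65"),
    "net_back_issue_sales": Decimal("4503.26"),
    "interest": Decimal("325.00"),
    "miscellaneous": Decimal("131.36"),
}
total_revenue = sum(revenue_items.values(), Decimal("0"))

total_expenses = Decimal("16444.14")          # assumed to be the expense total
excess = total_revenue - total_expenses       # excess of revenue over expense
transfer_to_reserve = Decimal("1221.00")      # moved to the back-issue reserve
increase_in_surplus = excess - transfer_to_reserve

surplus_1952 = Decimal("17441.96")
surplus_1953 = surplus_1952 + increase_in_surplus

# Cross-check against the Statement of Condition.
total_assets = Decimal("62357.33")
total_liabilities_and_reserves = Decimal("34731.24")
surplus_from_balance = total_assets - total_liabilities_and_reserves

print(total_revenue, excess, increase_in_surplus, surplus_1953, surplus_from_balance)
```

Under these assumptions both routes give the same surplus for December 31, 1953, which is the figure printed in each statement.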


The financial statements this year introduce some changes in accounting pro- 
cedure intended to more accurately reflect the policy of the Institute to keep back 
issues of the Annals continuously in stock. These changes are discussed below. 

The surplus account has again increased although the amount of increase is 
not quite as great as in 1952. The increase in the size of the Annals and the con- 
sequent increase in cost has been partially offset by an increase in receipts from 
dues and subscriptions. The Editor has warned that Volume 25 (1954) can be 
expected to be larger still. The expense of publishing a directory in 1953 has also 
contributed to cutting down the rate of increase of surplus. A directory contain- 
ing more information about each member is scheduled for 1954. Lower rates for 
dues for students and for members outside the United States and Canada become
effective in 1954. The income from investments (other than our investment in 
back issues) has increased and can be expected to increase even more in 1954. 

The new asset account, Inventory of Back Issues, is the result of Council action 





200 REPORT OF THE EDITOR 


in December, 1952. The Statement of Condition has regularly carried a footnote 
in which an estimate of the value of inventory has been given. The Council di- 
rected that an estimate of the value of inventory appear in the Statement of 
Condition. Other changes in the form of the financial statements are correlative. 
The encumbrance of a large fraction of our net worth by the policy of maintain- 
ing a supply of back issues has been recognized by setting up a reserve account. 
The Revenue and Expense Statement now recognizes depletion of inventory as
a current expense which, along with storage costs, service charges, postage, and
binding expenses, has been subtracted from gross income from sale of back
issues to obtain net income for the revenue item.

A Finance Committee with Mortimer Spiegelman as chairman was appointed 
late in 1953 and has, in addition to giving advice to the Council on policy with 
regard to investments, been of great assistance to the Treasurer in the day-to-day 
problems of carrying out the policy determined by the Council. The results of 
the actions of the Finance Committee will undoubtedly show clearly in the finan-
cial statements at the end of 1954.

K. J. ARNOLD
Secretary-Treasurer
December 29, 1953




REPORT OF THE EDITOR OF THE ANNALS FOR 1953 


During 1953 the rate of submission of new manuscripts continued to increase. 
The publication of 44 papers and 17 notes, together with the usual reports, ab-
stracts, and news and notices brought the 1953 volume to the new high of 708
pages. In addition, during the year a backlog has built up, equivalent to about 
100 printed pages. A corresponding expansion of the Annals seems highly de- 
sirable. 

Another Special Invited Paper, “Distribution-free tests of fit,” by Z. W. Birn-
baum, was published this year, the third one to appear so far. The Committee on
Special Invited Papers has now functioned for five years. As a cumulative result 
of its activities during this time an increase in the number of these valuable ex- 
pository papers may be expected starting with the coming year. 

The Editor wishes to thank the previous editor, Professor T. W. Anderson, 
for his assistance in many editorial problems during this first year of operation of 
a new board. In particular, an informal manual on current editorial procedure, 
prepared by Professor Anderson, was of invaluable help. 

On behalf of the Editorial board, the Editor takes the opportunity to acknowl-
edge the generous refereeing assistance of the following: G. E. Albert, R. L.
Anderson, Kenneth Arrow, Julius Blum, C. R. Blyth, A. H. Bowker, K. A. Bush,
Herman Chernoff, W. H. Clatworthy, A. C. Cohen, Jr., W. S. Connor, C. C.
Craig, George Dantzig, D. A. Darling, W. J. Dixon, Monroe Donsker, Meyer
Dwass, P. R. Dwyer, D. A. S. Fraser, M. A. Girshick, Leo Goodman, Ulf Gren-
ander, John Gurland, Alan Hoffman, Harold Hotelling, S. L. Isaacson, T. A.
Jeeves, N. L. Johnson, Solomon Kullback, L. M. LeCam, Michel Loève, Eugene
Lukacs, F. J. Massey, Jr., M. R. Mickey, Lincoln Moses, Frederick Mosteller,
C. R. Ohman, Roy Radner, Murray Rosenblatt, H. L. Royden, Henry Scheffé,
Elizabeth Scott, Seymour Sherman, S. S. Shrikhande, Rosedith Sitgreaves, J. L.
Snell, Milton Sobel, George Steck, W. F. Taylor, J. W. Tukey, J. E. Walsh,
Wolfgang Wasow, L. H. Wegner and S. S. Wilks.

The Editor is greatly indebted to Mrs. K. Wehner who prepared the manu- 
scripts for the printer, provided other editorial assistance, and carried out most 
of the secretarial work. 

E. L. LEHMANN 


Editor 
December 29, 1953 




PUBLICATIONS RECEIVED 


Anuario Estadistico de España, Instituto Nacional de Estadistica, 1953, 870 pp.

Brazzi, Alberta, Le Teorie Economiche dei Costi Comuni, and Felice Vinci, Sui Fonda-
menti dell’ Economica, Vol. 18, Università degli Studi di Milano, Istituto di Scienze
Economiche e Statistiche, Milan, 1953, 45 pp., 500 lire.

Kells, Lyman M., Elementary Differential Equations, 4th ed., McGraw-Hill Book Co.,
New York, 1954, x + 266 pp., $4.00.

Paige, L. J. and Olga Taussky, editors, Simultaneous Linear Equations and the Determina-
tion of Eigenvalues, National Bureau of Standards, Applied Mathematics Series 29,
U. S. Government Printing Office, Washington, D. C., 1953, iv + 126 pp., $1.50.

Vinci, Felice, Statistica ed Economica nella nostra Enciclopedia Matematica, Vol. 17, Uni-
versità degli Studi di Milano, Istituto di Scienze Economiche e Statistiche, Milan,
1953, 25 pp., 400 lire.








ESTADISTICA

Journal of the Inter American Statistical Institute

Vol. XI, No. 40 September 1953

Contents

Las Estadisticas del Turismo: Problemas Basicos y Situación en los Paises Americanos
.... ALBERTO VESPREMY BANGHA
A National Identity Registration System to Synthesize Social Statistics
.... HALBERT L. DUNN
Métodos y Problemas del Primer Censo de Comercio e Industrias de Costa Rica,
1950-1951 .... WILBURG JIMÉNEZ CASTRO
El Indice Canadiense de Precios al Consumidor (Traducción)
.... DOMINION BUREAU OF STATISTICS
Special Vegetable Surveys in the New York City Market Area .... IRVIN HOLMES
Nuevos Procedimientos Censales en el Canadá (Traducción) .... NATHAN KEYFITZ
La Estadistica como una Carrera .... HOWARD L. JONES y HARRY V. ROBERTS
Los Censos Industriales en las Naciones Americanas .... ROSE W. BURTON
Computing Charts for Social Security [...] de Computación para los
Beneficios de la Seguridad Social .... EUGENE A. RASOR
Center of Population of the United States, 1950—Centro de Población de los Estados
Unidos, 1950 .... U. S. BUREAU OF THE CENSUS
Institute Affairs     Statistical News     Publications

Published quarterly; annual subscription price $3.00 (U.S.); single copies $1.00 (U.S.)
Inter American Statistical Institute, c/o Pan American Union, Washington 6, D. C., U. S. A.


JOURNAL OF THE
AMERICAN STATISTICAL ASSOCIATION

1108 16th St., N.W. Washington 6, D. C. VOL. 48 NO. 264

Recent Advances in Finding Best Operating Conditions .... R. L. ANDERSON
The Problem of Autocorrelation in Regression Analysis .... R. L. ANDERSON
Percentage Points of the Incomplete Beta Function .... ROBERT E. CLARK
Statistical Problems of the Kinsey Report
.... WILLIAM G. COCHRAN, FREDERICK MOSTELLER, AND JOHN W. TUKEY
On A Probability Mechanism to Attain Economic Control of the Resultant Error of Response
and the Bias of Nonresponse .... W. EDWARDS DEMING
A Note on Regression When There Is Extraneous Information about One of the
Coefficients .... J. DURBIN
Census Tracts and Urban Research .... DONALD L. FOLEY
The Mathematical Basis of the Bean Method of Graphic Multiple Correlation
.... RICHARD J. FOOTE
A Hollerith Technique for the Solution of Normal Equations
.... M. J. R. HEALY AND G. V. DYKE
The Inventory Problem .... SEBASTIAN LITTAUER
Truncated Poisson Distributions .... PAUL R. RIDER
Effect of Weighting by Card-Duplication on Efficiency of Survey Results .... IRVING ROSHWALB
Tables of Expected Values of 1/X for Positive Bernoulli and Poisson Variables
.... EDWIN L. GRAB AND I. RICHARD SAVAGE
Bibliography of Nonparametric Statistics and Related Topics .... I. RICHARD SAVAGE
The Use of Runs to Control the Mean in Quality Control .... H. WEILER


THE AMERICAN STATISTICAL ASSOCIATION INVITES
AS MEMBERS ALL PERSONS INTERESTED IN:
1. Development of new theory and method
2. Improvement of basic statistical data
3. Application of statistical methods to practical problems.





ECONOMETRICA 


Journal of the Econometric Society 
Contents of Vol. 21, No. 4 - October, 1953 


M. Allais....Le Comportement de l’Homme Rationnel devant le Risque: Critique
des Postulats et Axiomes de l’Ecole Americaine
Karl A. Fox....A Spatial Equilibrium Model of the Livestock-Feed Economy in
the United States
Walter D. Fisher
On a Pooling Problem from the Statistical Decision Viewpoint
A. Dvoretzky, J. Kiefer, and J. Wolfowitz
On the Optimal Character of the (s,S) Policy in Inventory Theory
Gerard Debreu and I. N. Herstein...........Nonnegative Square Matrices
David C. McGarvey.........A Theorem on the Construction of Voting Paradoxes
Report of Lucknow Meeting 
Book Reviews, Communications, Announcements, and Notes 


Published Quarterly Subscription rates available on request 
The Econometric Society is an international society for the advancement of economic theory in its
relation to statistics and mathematics.
Subscriptions to Econometrica and inquiries about the work of the Society and the procedure in applying
for membership should be addressed to Rosson L. Cardwell, Secretary, The Econometric Society, The
University of Chicago, Chicago 37, Illinois, U. S. A.





BIOMETRIKA 


Volume 40 Contents Parts 3 and 4, December 1953 


The population frequencies of species and the estimation of population parameters. By I. J. GOOD.
Capture-recapture analysis. By J. M. HAMMERSLEY. The use of chain-binomials with a variable chance of
infection for the analysis of intra-household epidemics. By NORMAN T. J. BAILEY. Spread of diseases
in a rectangular plantation with vacancies. By G. H. FREEMAN. Tests of significance for concurrent
regression lines. By E. J. WILLIAMS. Approximate confidence intervals. II. More than one unknown
parameter. By M. S. BARTLETT. Non-normality and tests on variances. By G. E. P. BOX.
Approximating to the distributions of measures of dispersion by a power of χ². By J. H. CADWELL. The power
function of some tests based on range. By H. A. DAVID. Some simple approximate tests for Poisson
variates. By D. R. COX. Orthogonal polynomial fitting. By JOHN WISHART and THEOCHARIS
METAKIDES. Population differences between species growing according to simple birth and death processes.
By J. H. DARWIN. Modifications to the variate-difference method. By M. H. QUENOUILLE. Moments
of the rank correlation coefficient τ in the general case. By R. M. SUNDRUM. 99.9% and 0.1% points of
the χ²-distribution. By T. LEWIS. Tables of Symmetric Functions, Part IV. By F. N. DAVID and
M. G. KENDALL. Some procedures for comparing Poisson processes or populations. By ALLAN
BIRNBAUM. Scale factors and degrees of freedom for small sample sizes for χ-approximation to the
range. By GEORGE WM. THOMSON. The third moment of Gini’s mean difference. By A. R.
KAMAT. A method of systematic sampling based on order properties. By R. M. SUNDRUM. A note
on ordered least-squares estimation. By F. DOWNTON.


The subscription price, payable in advance, is 45s. inland, 54s. export (per volume including postage). Cheques 
should be drawn to Biometrika and sent to “The Secretary, Biometrika Office, Department of Statistics, 
University College, London, W.C. 1.” All foreign cheques must be in sterling and drawn on a bank 
having a London agency. 








MATHEMATICAL REVIEWS 


A journal containing reviews of the mathematical literature
of the world, with full subject and author indices


Publication of this journal is sponsored by the American Mathe- 
matical Society, Mathematical Association of America, Institute of 
Mathematical Statistics, London Mathematical Society, Edinburgh 
Mathematical Society, Union Matematica Argentina, and others. 


Subscriptions accepted to cover the calendar year only. 
Issues appear monthly except July. $20.00 per year. 


Send subscription order or request for sample copy to 


AMERICAN MATHEMATICAL SOCIETY 
80 Waterman Street, Providence 6, Rhode Island 


JOURNAL OF THE 
ROYAL STATISTICAL SOCIETY 


Series B (Methodological)
Vol. XV, No. 1, 1953 


F. J. ANSCOMBE ..... Sequential Estimation (with Discussion)
D. V. LINDLEY ..... Statistical Inference (with Discussion)
P. A. P. MORAN ..... The Random Division of an Interval, Pt. III
G. H. JOWETT AND J. F. SCOTT
Simple Graphical Techniques for Calculating Serial and Spatial Correlations, and Mean-Squared Differences
J. O. IRWIN
On the “Transition Probabilities” Corresponding to any Accident Distribution
P. ARMITAGE ..... A Note on the Time-homogeneous Birth Process
K. SINGER ..... Application of the Theory of Stochastic Processes to the Study of Irreproducible
Chemical Reactions and Nucleation Processes
M. S. BARTLETT AND D. V. RAJALAKSHMAN
Goodness of Fit Tests for Simultaneous Autoregressive Series
P. WHITTLE ..... The Analysis of Multiple Stationary Time Series
J. D. SARGAN
An Approximate Treatment of the Properties of the Correlogram and Periodogram


The Royal Statistical Society, 4, Portugal Street, London, W.C.2. 





SANKHYA 


The Indian Journal of Statistics 
Edited by P. C. Mahalanobis 
Vol. 12, Part 4, 1953 


P. C. MAHALANOBIS
Some Observations on the Process of Growth of National Income
J. B. S. HALDANE ..... The Estimation of Two Parameters from a Sample
C. L. MALLOWS ..... Sequential Discrimination
C. RADHAKRISHNA RAO
On Transformations Useful in the Distribution Problems of Least Squares
A. P. CALDERON AND H. B. MANN ..... On the Moments of Stochastic Integrals
TORE DALENIUS ..... The Economics of One-Stage Stratified Sampling
K. C. SEAL ..... On Certain Extended Cases of Double Sampling
J. M. SENGUPTA
Significance Level of ΣX²/(ΣX)² Based on Student’s Distribution
H. SINHA ..... Whither Statistics?
Indian Statistical Institute: Twentieth Annual Report: 1951-52 


ANNUAL SUBSCRIPTION: 30 rupees ($10.00), 10 rupees ($3.50) per issue.
BACK NUMBERS: 45 rupees ($15.00) per volume; 12/8 rupees ($4.50) per issue.
Subscriptions and orders for back numbers should be sent to 
STATISTICAL PUBLISHING SOCIETY 
204/1 Barrackpore Trunk Road Calcutta 35, India 





SKANDINAVISK 
AKTUARIETIDSKRIFT 


1953 - Parts 1 - 2 
Contents 


G. ARFWEDSON Research in collective risk theory. The case of equal risk sums 
CARL-ERIK QUENSEL The Distribution of the Partial Correlation Coefficient
in Samples from Multivariate Universes in a Special Case of Non-normally
Distributed Random Variables
D. E. BARTON On Neyman’s Smooth Test of Goodness of Fit and its Power
with Respect to a Particular System of Alternatives
ERLING SVERDRUP Similarity, Unbiasedness, Minimaxibility and 
Admissibility of Statistical Test Procedures 
ARNE JENSEN Markoff Chains as an Aid in the Study of Markoff Processes 
T. DALENIUS The Multi-variate Sampling Problem
Litteraturanmälan
Eksamen i Forsikringsvidenskab og Statistik ved Københavns Universitet
Översikt av utländska aktuarietidskrifter
De skandinaviske aktuarforeningers virksomhed i 1952 
XIVth International Congress of Actuaries, Madrid, 1954 


Annual subscription: $5.00 per year 
Inquiries and orders may be addressed to the Editor, 
GRANHALLSVAGEN 35, STOCKSUND, SWEDEN 







